COMPOUNDS AND METHODS FOR DIAGNOSIS OF TUBERCULOSIS

CROSS-REFERENCE TO RELATED APPLICATIONS 

This application is a continuation-in-part of U.S. Application
No. 09/024,753, filed February 18, 1998; which is a continuation-in-part of
U.S. Application No. 08/942,341, filed October 1, 1997; which is a continuation-in-part
of U.S. Application No. 08/818,111, filed March 13, 1997; which is a continuation-in-part
of U.S. Application No. 08/729,622, filed October 11, 1996; which claims priority
from PCT Application No. PCT/US96/14675, filed August 30, 1996; and is a
continuation-in-part of U.S. Application No. 08/680,574, filed July 12, 1996; which is a
continuation-in-part of U.S. Application No. 08/658,800, filed June 5, 1996; which is a
continuation-in-part of U.S. Application No. 08/620,280, filed March 22, 1996, now
abandoned; which is a continuation-in-part of U.S. Application No. 08/532,136, filed
September 22, 1995, now abandoned; which is a continuation of U.S. Application
No. 08/523,435, filed September 1, 1995, now abandoned.

TECHNICAL FIELD 

The present invention relates generally to the detection of
Mycobacterium tuberculosis infection. The invention is more particularly related to
polypeptides comprising a Mycobacterium tuberculosis antigen, or a portion or other
variant thereof, and the use of such polypeptides for the serodiagnosis of
Mycobacterium tuberculosis infection.

BACKGROUND OF THE INVENTION 
Tuberculosis is a chronic, infectious disease that is generally caused by
infection with Mycobacterium tuberculosis. It is a major disease in developing
countries, as well as an increasing problem in developed areas of the world, with about
8 million new cases and 3 million deaths each year. Although the infection may be
asymptomatic for a considerable period of time, the disease is most commonly
manifested as an acute inflammation of the lungs, resulting in fever and a nonproductive
cough. If left untreated, serious complications and death typically result.

Although tuberculosis can generally be controlled using extended
antibiotic therapy, such treatment is not sufficient to prevent the spread of the disease.
Infected individuals may be asymptomatic, but contagious, for some time. In addition,
although compliance with the treatment regimen is critical, patient behavior is difficult
to monitor. Some patients do not complete the course of treatment, which can lead to
ineffective treatment and the development of drug resistance.

Inhibiting the spread of tuberculosis will require effective vaccination
and accurate, early diagnosis of the disease. Currently, vaccination with live bacteria is
the most efficient method for inducing protective immunity. The most common
Mycobacterium for this purpose is Bacillus Calmette-Guerin (BCG), an avirulent strain
of Mycobacterium bovis. However, the safety and efficacy of BCG is a source of
controversy and some countries, such as the United States, do not vaccinate the general
public. Diagnosis is commonly achieved using a skin test, which involves intradermal
exposure to tuberculin PPD (protein-purified derivative). Antigen-specific T cell
responses result in measurable induration at the injection site by 48-72 hours after
injection, which indicates exposure to Mycobacterial antigens. Sensitivity and
specificity have, however, been a problem with this test, and individuals vaccinated
with BCG cannot be distinguished from infected individuals.

While macrophages have been shown to act as the principal effectors of
M. tuberculosis immunity, T cells are the predominant inducers of such immunity. The
essential role of T cells in protection against M. tuberculosis infection is illustrated by
the frequent occurrence of M. tuberculosis in AIDS patients, due to the depletion of
CD4 T cells associated with human immunodeficiency virus (HIV) infection.
Mycobacterium-reactive CD4 T cells have been shown to be potent producers of
gamma-interferon (IFN-γ), which, in turn, has been shown to trigger the
anti-mycobacterial effects of macrophages in mice. While the role of IFN-γ in humans is
less clear, studies have shown that 1,25-dihydroxy-vitamin D3, either alone or in
combination with IFN-γ or tumor necrosis factor-alpha, activates human macrophages
to inhibit M. tuberculosis infection. Furthermore, it is known that IFN-γ stimulates
human macrophages to make 1,25-dihydroxy-vitamin D3. Similarly, IL-12 has been
shown to play a role in stimulating resistance to M. tuberculosis infection. For a review
of the immunology of M. tuberculosis infection see Chan and Kaufmann, in
Tuberculosis: Pathogenesis, Protection and Control, Bloom (ed.), ASM Press,
Washington, DC, 1994.

Accordingly, there is a need in the art for improved diagnostic methods 
for detecting tuberculosis. The present invention fulfills this need and further provides 
other related advantages. 

SUMMARY OF THE INVENTION 

Briefly stated, the present invention provides compositions and methods
for diagnosing tuberculosis. In one aspect, polypeptides are provided comprising an
antigenic portion of a soluble M. tuberculosis antigen, or a variant of such an antigen
that differs only in conservative substitutions and/or modifications. In one embodiment
of this aspect, the soluble antigen has one of the following N-terminal sequences:

(a) Asp-Pro-Val-Asp-Ala-Val-Ile-Asn-Thr-Thr-Cys-Asn-Tyr-Gly-
Gln-Val-Val-Ala-Ala-Leu (SEQ ID NO: 115);

(b) Ala-Val-Glu-Ser-Gly-Met-Leu-Ala-Leu-Gly-Thr-Pro-Ala-Pro-
Ser (SEQ ID NO: 116);

(c) Ala-Ala-Met-Lys-Pro-Arg-Thr-Gly-Asp-Gly-Pro-Leu-Glu-Ala-
Ala-Lys-Glu-Gly-Arg (SEQ ID NO: 117);

(d) Tyr-Tyr-Trp-Cys-Pro-Gly-Gln-Pro-Phe-Asp-Pro-Ala-Trp-Gly-
Pro (SEQ ID NO: 118);

(e) Asp-Ile-Gly-Ser-Glu-Ser-Thr-Glu-Asp-Gln-Gln-Xaa-Ala-Val
(SEQ ID NO: 119);

(f) Ala-Glu-Glu-Ser-Ile-Ser-Thr-Xaa-Glu-Xaa-Ile-Val-Pro (SEQ ID
NO: 120);

(g) Asp-Pro-Glu-Pro-Ala-Pro-Pro-Val-Pro-Thr-Thr-Ala-Ala-Ser-
Pro-Pro-Ser (SEQ ID NO: 121);

(h) Ala-Pro-Lys-Thr-Tyr-Xaa-Glu-Glu-Leu-Lys-Gly-Thr-Asp-Thr-
Gly (SEQ ID NO: 122);

(i) Asp-Pro-Ala-Ser-Ala-Pro-Asp-Val-Pro-Thr-Ala-Ala-Gln-Leu-
Thr-Ser-Leu-Leu-Asn-Ser-Leu-Ala-Asp-Pro-Asn-Val-Ser-Phe-
Ala-Asn (SEQ ID NO: 123);

(j) Xaa-Asp-Ser-Glu-Lys-Ser-Ala-Thr-Ile-Lys-Val-Thr-Asp-Ala-
Ser (SEQ ID NO: 129);

(k) Ala-Gly-Asp-Thr-Xaa-Ile-Tyr-Ile-Val-Gly-Asn-Leu-Thr-Ala-
Asp (SEQ ID NO: 130); or

(l) Ala-Pro-Glu-Ser-Gly-Ala-Gly-Leu-Gly-Gly-Thr-Val-Gln-Ala-
Gly (SEQ ID NO: 131)

wherein Xaa may be any amino acid.
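
As an illustration only, and not part of the original disclosure, the following Python sketch shows one way a reader might compare a candidate protein against an N-terminal sequence of the kind listed above. It converts the three-letter notation to one-letter codes and treats Xaa as a wildcard. The function names and the test peptides are hypothetical; sequence (f) (SEQ ID NO: 120) is used as the worked example.

```python
import re

# Standard three-letter to one-letter amino acid codes.
# Xaa (any amino acid) is mapped to "X" and handled as a wildcard below.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
    "Xaa": "X",
}


def to_one_letter(three_letter_seq: str) -> str:
    """Convert a hyphen-separated sequence such as 'Asp-Pro-Val' to 'DPV'."""
    return "".join(THREE_TO_ONE[res] for res in three_letter_seq.split("-"))


def has_n_terminus(protein: str, n_terminal: str) -> bool:
    """Return True if `protein` (one-letter codes) begins with `n_terminal`.

    An 'X' in `n_terminal` matches any single residue.
    """
    pattern = "".join("." if c == "X" else re.escape(c) for c in n_terminal)
    return re.match(pattern, protein) is not None


# Sequence (f), SEQ ID NO: 120.
aees = to_one_letter("Ala-Glu-Glu-Ser-Ile-Ser-Thr-Xaa-Glu-Xaa-Ile-Val-Pro")

# Hypothetical candidate peptides, invented for this example.
print(aees)                                     # AEESISTXEXIVP
print(has_n_terminus("AEESISTKEGIVPQQ", aees))  # True
print(has_n_terminus("AEESISTKDGIVP", aees))    # False
```

The second candidate fails because its ninth residue is Asp, whereas sequence (f) fixes Glu at that position; only the two Xaa positions are free.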

In a related aspect, polypeptides are provided comprising an
immunogenic portion of an M. tuberculosis antigen, or a variant of such an antigen that
differs only in conservative substitutions and/or modifications, the antigen having one
of the following N-terminal sequences:

(m) Xaa-Tyr-Ile-Ala-Tyr-Xaa-Thr-Thr-Ala-Gly-Ile-Val-Pro-Gly-Lys-
Ile-Asn-Val-His-Leu-Val (SEQ ID NO: 132); or

(n) Asp-Pro-Pro-Asp-Pro-His-Gln-Xaa-Asp-Met-Thr-Lys-Gly-Tyr-
Tyr-Pro-Gly-Gly-Arg-Arg-Xaa-Phe (SEQ ID NO: 124)

wherein Xaa may be any amino acid.

In another embodiment, the soluble M. tuberculosis antigen comprises an
amino acid sequence encoded by a DNA sequence selected from the group consisting of
the sequences recited in SEQ ID NOS: 1, 2, 4-10, 13-25, 52, 94 and 96, the
complements of said sequences, and DNA sequences that hybridize to a sequence
recited in SEQ ID NOS: 1, 2, 4-10, 13-25, 52, 94 and 96 or a complement thereof under
moderately stringent conditions.
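
For readers less familiar with the term, the complement of a DNA sequence can be computed mechanically. The Python sketch below is an illustration under stated assumptions, not part of the disclosure: the 12-nucleotide fragment is invented, the function name is hypothetical, and the code does not attempt to model hybridization or stringency conditions.

```python
# Watson-Crick base pairing for unambiguous DNA bases; N (unknown) pairs with N.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(dna: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return dna.translate(_COMPLEMENT)[::-1]


# Hypothetical fragment, used only to illustrate the operation.
fragment = "ATGGCCGATTCA"
print(reverse_complement(fragment))  # TGAATCGGCCAT
```

Applying the function twice returns the original sequence, which is a quick sanity check when working with longer inputs.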

In a related aspect, the polypeptides comprise an antigenic portion of an
M. tuberculosis antigen, or a variant of such an antigen that differs only in conservative
substitutions and/or modifications, wherein the antigen comprises an amino acid
sequence encoded by a DNA sequence selected from the group consisting of the
sequences recited in SEQ ID NOS: 26-51, 133, 134, 158-178, 184-188, 194-196, 198,
210-220, 232, 234, 235, 237-242, 248-251, 256-271, 287, 288, 290-293 and 298-337,
the complements of said sequences, and DNA sequences that hybridize to a sequence
recited in SEQ ID NOS: 26-51, 133, 134, 158-178, 184-188, 194-196, 198, 210-220,
232, 234, 235, 237-242, 248-251, 256-271, 287, 288, 290-293 and 298-337, or a
complement thereof under moderately stringent conditions.
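
Because the recited group mixes single identifiers with ranges, it can be convenient to expand it into an explicit set when checking whether a given SEQ ID NO falls within it. The following Python sketch is one hypothetical way to do so; the function name is an assumption, and the input string is simply the list recited above.

```python
def expand_seq_ids(spec: str) -> set:
    """Expand a list such as '26-51, 133 and 298-337' into a set of integers."""
    ids = set()
    # Treat the final " and " like a comma so every item is split uniformly.
    for part in spec.replace(" and ", ",").split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            low, high = (int(x) for x in part.split("-"))
            ids.update(range(low, high + 1))  # ranges are inclusive
        else:
            ids.add(int(part))
    return ids


recited = expand_seq_ids(
    "26-51, 133, 134, 158-178, 184-188, 194-196, 198, 210-220, 232, 234, "
    "235, 237-242, 248-251, 256-271, 287, 288, 290-293 and 298-337"
)

print(133 in recited)  # True
print(233 in recited)  # False
```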

In related aspects, DNA sequences encoding the above polypeptides,
recombinant expression vectors comprising these DNA sequences and host cells
transformed or transfected with such expression vectors are also provided.

In another aspect, the present invention provides fusion proteins 
comprising a first and a second inventive polypeptide or, alternatively, an inventive 
polypeptide and a known M. tuberculosis antigen.

In further aspects of the subject invention, methods and diagnostic kits
are provided for detecting tuberculosis in a patient. The methods comprise:
(a) contacting a biological sample with at least one of the above polypeptides; and
(b) detecting in the sample the presence of antibodies that bind to the polypeptide or
polypeptides, thereby detecting M. tuberculosis infection in the biological sample.
Suitable biological samples include whole blood, sputum, serum, plasma, saliva,
cerebrospinal fluid and urine. The diagnostic kits comprise one or more of the above
polypeptides in combination with a detection reagent.

The present invention also provides methods for detecting
M. tuberculosis infection comprising: (a) obtaining a biological sample from a patient;
(b) contacting the sample with at least one oligonucleotide primer in a polymerase
chain reaction, the oligonucleotide primer being specific for a DNA sequence encoding
the above polypeptides; and (c) detecting in the sample a DNA sequence that amplifies
in the presence of the first and second oligonucleotide primers. In one embodiment, the
oligonucleotide primer comprises at least about 10 contiguous nucleotides of such a
DNA sequence.
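
The length criterion in this embodiment can be checked computationally. The Python sketch below is a minimal illustration under several assumptions: the target and primer sequences are invented, the function names are hypothetical, "at least about 10" is read as a fixed threshold of 10, and only the given strand of the target is examined.

```python
def longest_shared_run(primer: str, target: str) -> int:
    """Length of the longest contiguous stretch of `primer` found in `target`."""
    primer, target = primer.upper(), target.upper()
    best = 0
    for start in range(len(primer)):
        # Only look for stretches longer than the best one found so far.
        end = start + best + 1
        while end <= len(primer) and primer[start:end] in target:
            best = end - start
            end += 1
    return best


def meets_length_criterion(primer: str, target: str, min_len: int = 10) -> bool:
    """True if the primer shares at least `min_len` contiguous nucleotides."""
    return longest_shared_run(primer, target) >= min_len


# Hypothetical sequences, invented for this example.
target = "GGATCCATGGCCGATTCAGGTACC"
print(longest_shared_run("TTTATGGCCGATTCAG", target))   # 13
print(meets_length_criterion("ATGGCCGAAAAA", target))   # False (only 8 shared)
```

In practice a primer for the complementary strand would be compared against the reverse complement of the target; that step is omitted here to keep the sketch short.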
In a further aspect, the present invention provides a method for detecting
M. tuberculosis infection in a patient comprising: (a) obtaining a biological sample
from the patient; (b) contacting the sample with an oligonucleotide probe specific for a
DNA sequence encoding the above polypeptides; and (c) detecting in the sample a DNA
sequence that hybridizes to the oligonucleotide probe. In one embodiment, the
oligonucleotide probe comprises at least about 15 contiguous nucleotides of such a
DNA sequence.

In yet another aspect, the present invention provides antibodies, both
polyclonal and monoclonal, that bind to the polypeptides described above, as well as
methods for their use in the detection of M. tuberculosis infection.

These and other aspects of the present invention will become apparent 
upon reference to the following detailed description and attached drawings. All 
references disclosed herein are hereby incorporated by reference in their entirety as if 
each was incorporated individually. 

BRIEF DESCRIPTION OF THE DRAWINGS AND SEQUENCE IDENTIFIERS 

Figures 1A and 1B illustrate the stimulation of proliferation and interferon-γ
production in T cells derived from a first and a second M. tuberculosis-immune donor,
respectively, by the 14 Kd, 20 Kd and 26 Kd antigens described in Example 1.

Figures 2A-D illustrate the reactivity of antisera raised against secretory
M. tuberculosis proteins, the known M. tuberculosis antigen 85b and the inventive
antigens Tb38-1 and TbH-9, respectively, with M. tuberculosis lysate (lane 2),
M. tuberculosis secretory proteins (lane 3), recombinant Tb38-1 (lane 4), recombinant
TbH-9 (lane 5) and recombinant 85b (lane 5).

Figure 3A illustrates the stimulation of proliferation in a TbH-9-specific
T cell clone by secretory M. tuberculosis proteins, recombinant TbH-9 and a control
antigen, TbRa11.

Figure 3B illustrates the stimulation of interferon-γ production in a
TbH-9-specific T cell clone by secretory M. tuberculosis proteins, PPD and recombinant
TbH-9.
Figure 4 illustrates the reactivity of two representative polypeptides with
sera from M. tuberculosis-infected and uninfected individuals, as compared to the
reactivity of bacterial lysate.

Figure 5 shows the reactivity of four representative polypeptides with
sera from M. tuberculosis-infected and uninfected individuals, as compared to the
reactivity of the 38 kD antigen.

Figure 6 shows the reactivity of recombinant 38 kD and TbRa11
antigens with sera from M. tuberculosis patients, PPD positive donors and normal
donors.

Figure 7 shows the reactivity of the antigen TbRa2A with 38 kD
negative sera.

Figure 8 shows the reactivity of the antigen of SEQ ID NO: 60 with sera
from M. tuberculosis patients and normal donors.

Figure 9 illustrates the reactivity of the recombinant antigen TbH-29
(SEQ ID NO: 137) with sera from M. tuberculosis patients, PPD positive donors and
normal donors as determined by indirect ELISA.

Figure 10 illustrates the reactivity of the recombinant antigen TbH-33
(SEQ ID NO: 140) with sera from M. tuberculosis patients and from normal donors, and
with a pool of sera from M. tuberculosis patients, as determined both by direct and
indirect ELISA.

Figure 11 illustrates the reactivity of increasing concentrations of the
recombinant antigen TbH-33 (SEQ ID NO: 140) with sera from M. tuberculosis patients
and from normal donors as determined by ELISA.

Figures 12A-E illustrate the reactivity of the recombinant antigens MO-1,
MO-2, MO-4, MO-28 and MO-29, respectively, with sera from M. tuberculosis
patients and from normal donors as determined by ELISA.

SEQ. ID NO. 1 is the DNA sequence of TbRa1.
SEQ. ID NO. 2 is the DNA sequence of TbRa10.
SEQ. ID NO. 3 is the DNA sequence of TbRa11.
SEQ. ID NO. 4 is the DNA sequence of TbRa12.
SEQ. ID NO. 5 is the DNA sequence of TbRa13.
SEQ. ID NO. 6 is the DNA sequence of TbRa16.
SEQ. ID NO. 7 is the DNA sequence of TbRa17.
SEQ. ID NO. 8 is the DNA sequence of TbRa18.
SEQ. ID NO. 9 is the DNA sequence of TbRa19.
SEQ. ID NO. 10 is the DNA sequence of TbRa24.
SEQ. ID NO. 11 is the DNA sequence of TbRa26.
SEQ. ID NO. 12 is the DNA sequence of TbRa28.
SEQ. ID NO. 13 is the DNA sequence of TbRa29.
SEQ. ID NO. 14 is the DNA sequence of TbRa2A.
SEQ. ID NO. 15 is the DNA sequence of TbRa3.
SEQ. ID NO. 16 is the DNA sequence of TbRa32.
SEQ. ID NO. 17 is the DNA sequence of TbRa35.
SEQ. ID NO. 18 is the DNA sequence of TbRa36.
SEQ. ID NO. 19 is the DNA sequence of TbRa4.
SEQ. ID NO. 20 is the DNA sequence of TbRa9.
SEQ. ID NO. 21 is the DNA sequence of TbRaB.
SEQ. ID NO. 22 is the DNA sequence of TbRaC.
SEQ. ID NO. 23 is the DNA sequence of TbRaD.
SEQ. ID NO. 24 is the DNA sequence of YYWCPG.
SEQ. ID NO. 25 is the DNA sequence of AAMK.
SEQ. ID NO. 26 is the DNA sequence of TbL-23.
SEQ. ID NO. 27 is the DNA sequence of TbL-24.
SEQ. ID NO. 28 is the DNA sequence of TbL-25.
SEQ. ID NO. 29 is the DNA sequence of TbL-28.
SEQ. ID NO. 30 is the DNA sequence of TbL-29.
SEQ. ID NO. 31 is the DNA sequence of TbH-5.
SEQ. ID NO. 32 is the DNA sequence of TbH-8.
SEQ. ID NO. 33 is the DNA sequence of TbH-9.
SEQ. ID NO. 34 is the DNA sequence of TbM-1.
SEQ. ID NO. 35 is the DNA sequence of TbM-3.
SEQ. ID NO. 36 is the DNA sequence of TbM-6.
SEQ. ID NO. 37 is the DNA sequence of TbM-7.
SEQ. ID NO. 38 is the DNA sequence of TbM-9.
SEQ. ID NO. 39 is the DNA sequence of TbM-12.
SEQ. ID NO. 40 is the DNA sequence of TbM-13.
SEQ. ID NO. 41 is the DNA sequence of TbM-14.
SEQ. ID NO. 42 is the DNA sequence of TbM-15.
SEQ. ID NO. 43 is the DNA sequence of TbH-4.
SEQ. ID NO. 44 is the DNA sequence of TbH-4-FWD.
SEQ. ID NO. 45 is the DNA sequence of TbH-12.
SEQ. ID NO. 46 is the DNA sequence of Tb38-1.
SEQ. ID NO. 47 is the DNA sequence of Tb38-4.
SEQ. ID NO. 48 is the DNA sequence of TbL-17.
SEQ. ID NO. 49 is the DNA sequence of TbL-20.
SEQ. ID NO. 50 is the DNA sequence of TbL-21.
SEQ. ID NO. 51 is the DNA sequence of TbH-16.
SEQ. ID NO. 52 is the DNA sequence of DPEP.
SEQ. ID NO. 53 is the deduced amino acid sequence of DPEP.
SEQ. ID NO. 54 is the protein sequence of DPV N-terminal Antigen.
SEQ. ID NO. 55 is the protein sequence of AVGS N-terminal Antigen.
SEQ. ID NO. 56 is the protein sequence of AAMK N-terminal Antigen.
SEQ. ID NO. 57 is the protein sequence of YYWC N-terminal Antigen.
SEQ. ID NO. 58 is the protein sequence of DIGS N-terminal Antigen.
SEQ. ID NO. 59 is the protein sequence of AEES N-terminal Antigen.
SEQ. ID NO. 60 is the protein sequence of DPEP N-terminal Antigen.
SEQ. ID NO. 61 is the protein sequence of APKT N-terminal Antigen.
SEQ. ID NO. 62 is the protein sequence of DPAS N-terminal Antigen.
SEQ. ID NO. 63 is the deduced amino acid sequence of TbM-1 Peptide.
SEQ. ID NO. 64 is the deduced amino acid sequence of TbRa1.
SEQ. ID NO. 65 is the deduced amino acid sequence of TbRa10.
SEQ. ID NO. 66 is the deduced amino acid sequence of TbRa11.
SEQ. ID NO. 67 is the deduced amino acid sequence of TbRa12.
SEQ. ID NO. 68 is the deduced amino acid sequence of TbRa13.
SEQ. ID NO. 69 is the deduced amino acid sequence of TbRa16.
SEQ. ID NO. 70 is the deduced amino acid sequence of TbRa17.
SEQ. ID NO. 71 is the deduced amino acid sequence of TbRa18.
SEQ. ID NO. 72 is the deduced amino acid sequence of TbRa19.
SEQ. ID NO. 73 is the deduced amino acid sequence of TbRa24.
SEQ. ID NO. 74 is the deduced amino acid sequence of TbRa26.
SEQ. ID NO. 75 is the deduced amino acid sequence of TbRa28.
SEQ. ID NO. 76 is the deduced amino acid sequence of TbRa29.
SEQ. ID NO. 77 is the deduced amino acid sequence of TbRa2A.
SEQ. ID NO. 78 is the deduced amino acid sequence of TbRa3.
SEQ. ID NO. 79 is the deduced amino acid sequence of TbRa32.
SEQ. ID NO. 80 is the deduced amino acid sequence of TbRa35.
SEQ. ID NO. 81 is the deduced amino acid sequence of TbRa36.
SEQ. ID NO. 82 is the deduced amino acid sequence of TbRa4.
SEQ. ID NO. 83 is the deduced amino acid sequence of TbRa9.
SEQ. ID NO. 84 is the deduced amino acid sequence of TbRaB.
SEQ. ID NO. 85 is the deduced amino acid sequence of TbRaC.
SEQ. ID NO. 86 is the deduced amino acid sequence of TbRaD.
SEQ. ID NO. 87 is the deduced amino acid sequence of YYWCPG.
SEQ. ID NO. 88 is the deduced amino acid sequence of TbAAMK.
SEQ. ID NO. 89 is the deduced amino acid sequence of Tb38-1.
SEQ. ID NO. 90 is the deduced amino acid sequence of TbH-4.
SEQ. ID NO. 91 is the deduced amino acid sequence of TbH-8.
SEQ. ID NO. 92 is the deduced amino acid sequence of TbH-9.
SEQ. ID NO. 93 is the deduced amino acid sequence of TbH-12.
SEQ. ID NO. 94 is the DNA sequence of DPAS.
SEQ. ID NO. 95 is the deduced amino acid sequence of DPAS.
SEQ. ID NO. 96 is the DNA sequence of DPV.
SEQ. ID NO. 97 is the deduced amino acid sequence of DPV.
SEQ. ID NO. 98 is the DNA sequence of ESAT-6.
SEQ. ID NO. 99 is the deduced amino acid sequence of ESAT-6.
SEQ. ID NO. 100 is the DNA sequence of TbH-8-2.
SEQ. ID NO. 101 is the DNA sequence of TbH-9FL.
SEQ. ID NO. 102 is the deduced amino acid sequence of TbH-9FL.
SEQ. ID NO. 103 is the DNA sequence of TbH-9-1.
SEQ. ID NO. 104 is the deduced amino acid sequence of TbH-9-1.
SEQ. ID NO. 105 is the DNA sequence of TbH-9-4.
SEQ. ID NO. 106 is the deduced amino acid sequence of TbH-9-4.
SEQ. ID NO. 107 is the DNA sequence of Tb38-1F2 IN.
SEQ. ID NO. 108 is the DNA sequence of Tb38-1F2 RP.
SEQ. ID NO. 109 is the deduced amino acid sequence of Tb37-FL.
SEQ. ID NO. 110 is the deduced amino acid sequence of Tb38-IN.
SEQ. ID NO. 111 is the DNA sequence of Tb38-1F3.
SEQ. ID NO. 112 is the deduced amino acid sequence of Tb38-1F3.
SEQ. ID NO. 113 is the DNA sequence of Tb38-1F5.
SEQ. ID NO. 114 is the DNA sequence of Tb38-1F6.
SEQ. ID NO. 115 is the deduced N-terminal amino acid sequence of DPV.
SEQ. ID NO. 116 is the deduced N-terminal amino acid sequence of AVGS.
SEQ. ID NO. 117 is the deduced N-terminal amino acid sequence of AAMK.
SEQ. ID NO. 118 is the deduced N-terminal amino acid sequence of YYWC.
SEQ. ID NO. 119 is the deduced N-terminal amino acid sequence of DIGS.
SEQ. ID NO. 120 is the deduced N-terminal amino acid sequence of AEES.
SEQ. ID NO. 121 is the deduced N-terminal amino acid sequence of DPEP.
SEQ. ID NO. 122 is the deduced N-terminal amino acid sequence of APKT.
SEQ. ID NO. 123 is the deduced N-terminal amino acid sequence of DPAS.
SEQ. ID NO. 124 is the protein sequence of DPPD N-terminal Antigen.
SEQ ID NO. 125-128 are the protein sequences of four DPPD cyanogen
bromide fragments.
SEQ ID NO. 129 is the N-terminal protein sequence of XDS antigen.
SEQ ID NO. 130 is the N-terminal protein sequence of AGD antigen.
SEQ ID NO. 131 is the N-terminal protein sequence of APE antigen.
SEQ ID NO. 132 is the N-terminal protein sequence of XYI antigen.
SEQ ID NO. 133 is the DNA sequence of TbH-29.
SEQ ID NO. 134 is the DNA sequence of TbH-30.
SEQ ID NO. 135 is the DNA sequence of TbH-32.
SEQ ID NO. 136 is the DNA sequence of TbH-33.
SEQ ID NO. 137 is the predicted amino acid sequence of TbH-29.
SEQ ID NO. 138 is the predicted amino acid sequence of TbH-30.
SEQ ID NO. 139 is the predicted amino acid sequence of TbH-32.
SEQ ID NO. 140 is the predicted amino acid sequence of TbH-33.
SEQ ID NO: 141-146 are PCR primers used in the preparation of a fusion
protein containing TbRa3, 38 kD and Tb38-1.
SEQ ID NO: 147 is the DNA sequence of the fusion protein containing TbRa3,
38 kD and Tb38-1.
SEQ ID NO: 148 is the amino acid sequence of the fusion protein containing
TbRa3, 38 kD and Tb38-1.
SEQ ID NO: 149 is the DNA sequence of the M. tuberculosis antigen 38 kD.
SEQ ID NO: 150 is the amino acid sequence of the M. tuberculosis antigen 38
kD.
SEQ ID NO: 151 is the DNA sequence of XP14.
SEQ ID NO: 152 is the DNA sequence of XP24.
SEQ ID NO: 153 is the DNA sequence of XP31.
SEQ ID NO: 154 is the 5' DNA sequence of XP32.
SEQ ID NO: 155 is the 3' DNA sequence of XP32.
SEQ ID NO: 156 is the predicted amino acid sequence of XP14.
SEQ ID NO: 157 is the predicted amino acid sequence encoded by the reverse
complement of XP14.
SEQ ID NO: 158 is the DNA sequence of XP27.
SEQ ID NO: 159 is the DNA sequence of XP36.
SEQ ID NO: 160 is the 5' DNA sequence of XP4.
SEQ ID NO: 161 is the 5' DNA sequence of XP5.
SEQ ID NO: 162 is the 5' DNA sequence of XP17.
SEQ ID NO: 163 is the 5' DNA sequence of XP30.
SEQ ID NO: 164 is the 5' DNA sequence of XP2.
SEQ ID NO: 165 is the 3' DNA sequence of XP2.
SEQ ID NO: 166 is the 5' DNA sequence of XP3.
SEQ ID NO: 167 is the 3' DNA sequence of XP3.
SEQ ID NO: 168 is the 5' DNA sequence of XP6.
SEQ ID NO: 169 is the 3' DNA sequence of XP6.
SEQ ID NO: 170 is the 5' DNA sequence of XP18.
SEQ ID NO: 171 is the 3' DNA sequence of XP18.
SEQ ID NO: 172 is the 5' DNA sequence of XP19.
SEQ ID NO: 173 is the 3' DNA sequence of XP19.
SEQ ID NO: 174 is the 5' DNA sequence of XP22.
SEQ ID NO: 175 is the 3' DNA sequence of XP22.
SEQ ID NO: 176 is the 5' DNA sequence of XP25.
SEQ ID NO: 177 is the 3' DNA sequence of XP25.
SEQ ID NO: 178 is the full-length DNA sequence of TbH4-XP1.
SEQ ID NO: 179 is the predicted amino acid sequence of TbH4-XP1.
SEQ ID NO: 180 is the predicted amino acid sequence encoded by the reverse
complement of TbH4-XP1.
SEQ ID NO: 181 is a first predicted amino acid sequence encoded by XP36.
SEQ ID NO: 182 is a second predicted amino acid sequence encoded by XP36.
SEQ ID NO: 183 is the predicted amino acid sequence encoded by the reverse
complement of XP36.
SEQ ID NO: 184 is the DNA sequence of RDIF2.
SEQ ID NO: 185 is the DNA sequence of RDIF5.
SEQ ID NO: 186 is the DNA sequence of RDIF8.
SEQ ID NO: 187 is the DNA sequence of RDIF10.
SEQ ID NO: 188 is the DNA sequence of RDIF11.
SEQ ID NO: 189 is the predicted amino acid sequence of RDIF2.
SEQ ID NO: 190 is the predicted amino acid sequence of RDIF5.
SEQ ID NO: 191 is the predicted amino acid sequence of RDIF8.
SEQ ID NO: 192 is the predicted amino acid sequence of RDIF10.
SEQ ID NO: 193 is the predicted amino acid sequence of RDIF11.
SEQ ID NO: 194 is the 5' DNA sequence of RDIF12.
SEQ ID NO: 195 is the 3' DNA sequence of RDIF12.
SEQ ID NO: 196 is the DNA sequence of RDIF7.
SEQ ID NO: 197 is the predicted amino acid sequence of RDIF7.
SEQ ID NO: 198 is the DNA sequence of DIF2-1.
SEQ ID NO: 199 is the predicted amino acid sequence of DIF2-1.
SEQ ID NO: 200-207 are PCR primers used in the preparation of a fusion
protein containing TbRa3, 38 kD, Tb38-1 and DPEP (hereinafter referred to as
TbF-2).
SEQ ID NO: 208 is the DNA sequence of the fusion protein TbF-2.
SEQ ID NO: 209 is the amino acid sequence of the fusion protein TbF-2.
SEQ ID NO: 210 is the 5' DNA sequence of MO-1.
SEQ ID NO: 211 is the 5' DNA sequence for MO-2.
SEQ ID NO: 212 is the 5' DNA sequence for MO-4.
SEQ ID NO: 213 is the 5' DNA sequence for MO-8.
SEQ ID NO: 214 is the 5' DNA sequence for MO-9.
SEQ ID NO: 215 is the 5' DNA sequence for MO-26.
SEQ ID NO: 216 is the 5' DNA sequence for MO-28.
SEQ ID NO: 217 is the 5' DNA sequence for MO-29.
SEQ ID NO: 218 is the 5' DNA sequence for MO-30.
SEQ ID NO: 219 is the 5' DNA sequence for MO-34.
SEQ ID NO: 220 is the 5' DNA sequence for MO-35.
SEQ ID NO: 221 is the predicted amino acid sequence for MO-1.
SEQ ID NO: 222 is the predicted amino acid sequence for MO-2.
SEQ ID NO: 223 is the predicted amino acid sequence for MO-4.
SEQ ID NO: 224 is the predicted amino acid sequence for MO-8.
SEQ ID NO: 225 is the predicted amino acid sequence for MO-9.
SEQ ID NO: 226 is the predicted amino acid sequence for MO-26.
SEQ ID NO: 227 is the predicted amino acid sequence for MO-28.
SEQ ID NO: 228 is the predicted amino acid sequence for MO-29.
SEQ ID NO: 229 is the predicted amino acid sequence for MO-30.
SEQ ID NO: 230 is the predicted amino acid sequence for MO-34.
SEQ ID NO: 231 is the predicted amino acid sequence for MO-35.
SEQ ID NO: 232 is the determined DNA sequence for MO-10.
SEQ ID NO: 233 is the predicted amino acid sequence for MO-10.
SEQ ID NO: 234 is the 3' DNA sequence for MO-27.
SEQ ID NO: 235 is the full-length DNA sequence for DPPD.
SEQ ID NO: 236 is the predicted full-length amino acid sequence for DPPD
SEQ ID NO: 237 is the determined 5' cDNA sequence for LSER-10
SEQ ID NO: 238 is the determined 5' cDNA sequence for LSER-11
SEQ ID NO: 239 is the determined 5' cDNA sequence for LSER-12
SEQ ID NO: 240 is the determined 5' cDNA sequence for LSER-13
SEQ ID NO: 241 is the determined 5' cDNA sequence for LSER-16
SEQ ID NO: 242 is the determined 5' cDNA sequence for LSER-25
SEQ ID NO: 243 is the predicted amino acid sequence for LSER-10
SEQ ID NO: 244 is the predicted amino acid sequence for LSER-12
SEQ ID NO: 245 is the predicted amino acid sequence for LSER-13
SEQ ID NO: 246 is the predicted amino acid sequence for LSER-16
SEQ ID NO: 247 is the predicted amino acid sequence for LSER-25
SEQ ID NO: 248 is the determined cDNA sequence for LSER-18
SEQ ID NO: 249 is the determined cDNA sequence for LSER-23
SEQ ID NO: 250 is the determined cDNA sequence for LSER-24
SEQ ID NO: 251 is the determined cDNA sequence for LSER-27
SEQ ID NO: 252 is the predicted amino acid sequence for LSER-18
SEQ ID NO: 253 is the predicted amino acid sequence for LSER-23
SEQ ID NO: 254 is the predicted amino acid sequence for LSER-24
SEQ ID NO: 255 is the predicted amino acid sequence for LSER-27
SEQ ID NO: 256 is the determined 5' cDNA sequence for LSER-1
SEQ ID NO: 257 is the determined 5' cDNA sequence for LSER-3
SEQ ID NO: 258 is the determined 5' cDNA sequence for LSER-4
SEQ ID NO: 259 is the determined 5' cDNA sequence for LSER-5
SEQ ID NO: 260 is the determined 5' cDNA sequence for LSER-6
SEQ ID NO: 261 is the determined 5' cDNA sequence for LSER-8
SEQ ID NO: 262 is the determined 5' cDNA sequence for LSER-14
SEQ ID NO: 263 is the determined 5' cDNA sequence for LSER-15
SEQ ID NO: 264 is the determined 5' cDNA sequence for LSER-17
SEQ ID NO: 265 is the determined 5' cDNA sequence for LSER-19
SEQ ID NO: 266 is the determined 5' cDNA sequence for LSER-20
SEQ ID NO: 267 is the determined 5' cDNA sequence for LSER-22
SEQ ID NO: 268 is the determined 5' cDNA sequence for LSER-26
SEQ ID NO: 269 is the determined 5' cDNA sequence for LSER-28
SEQ ID NO: 270 is the determined 5' cDNA sequence for LSER-29
SEQ ID NO: 271 is the determined 5' cDNA sequence for LSER-30
SEQ ID NO: 272 is the predicted amino acid sequence for LSER-1
SEQ ID NO: 273 is the predicted amino acid sequence for LSER-3
SEQ ID NO: 274 is the predicted amino acid sequence for LSER-5
SEQ ID NO: 275 is the predicted amino acid sequence for LSER-6
SEQ ID NO: 276 is the predicted amino acid sequence for LSER-8
SEQ ID NO: 277 is the predicted amino acid sequence for LSER-14
SEQ ID NO: 278 is the predicted amino acid sequence for LSER-15
SEQ ID NO: 279 is the predicted amino acid sequence for LSER-17
SEQ ID NO: 280 is the predicted amino acid sequence for LSER-19
SEQ ID NO: 281 is the predicted amino acid sequence for LSER-20
SEQ ID NO: 282 is the predicted amino acid sequence for LSER-22
SEQ ID NO: 283 is the predicted amino acid sequence for LSER-26
SEQ ID NO: 284 is the predicted amino acid sequence for LSER-28
SEQ ID NO: 285 is the predicted amino acid sequence for LSER-29
SEQ ID NO: 286 is the predicted amino acid sequence for LSER-30
SEQ ID NO: 287 is the determined cDNA sequence for LSER-9
SEQ ID NO: 288 is the determined cDNA sequence for the reverse complement
of LSER-6
SEQ ID NO: 289 is the predicted amino acid sequence for the reverse
complement of LSER-6
SEQ ID NO: 290 is the determined 5' cDNA sequence for MO-12
SEQ ID NO: 291 is the determined 5' cDNA sequence for MO-13
SEQ ID NO: 292 is the determined 5' cDNA sequence for MO-19
SEQ ID NO: 293 is the determined 5' cDNA sequence for MO-39
SEQ ID NO: 294 is the predicted amino acid sequence for MO-12
SEQ ID NO: 295 is the predicted amino acid sequence for MO-13
SEQ ID NO: 296 is the predicted amino acid sequence for MO-19
SEQ ID NO: 297 is the predicted amino acid sequence for MO-39
SEQ ID NO: 298 is the determined 5' cDNA sequence for Erdsn-1
SEQ ID NO: 299 is the determined 5' cDNA sequence for Erdsn-2
SEQ ID NO: 300 is the determined 5' cDNA sequence for Erdsn-4
SEQ ID NO: 301 is the determined 5' cDNA sequence for Erdsn-5
SEQ ID NO: 302 is the determined 5' cDNA sequence for Erdsn-6
SEQ ID NO: 303 is the determined 5' cDNA sequence for Erdsn-7
SEQ ID NO: 304 is the determined 5' cDNA sequence for Erdsn-8
SEQ ID NO: 305 is the determined 5' cDNA sequence for Erdsn-9
SEQ ID NO: 306 is the determined 5' cDNA sequence for Erdsn-10
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SEQ ID NO: 307 is the determined 
SEQ ID NO: 308 is the detennined 
SEQ ID NO: 309 is the determined 
SEQ ID NO: 310 is the determined 
SEQ ID NO: 3 1 1 is the determined 
SEQ ID NO: 312 is the determined 
SEQ ID NO: 313 is the determined 
SEQ ID NO: 314 is the determined 
SEQ ID NO: 315 is the determined 
SEQ ID NO: 316 is the determined 
SEQ ID NO: 317 is the determmed 
SEQ ID NO: 318 is the determined 
SEQ ID NO: 319 is the determined 
SEQ ID NO: 320 is the determined 
SEQ ID NO: 321 is the determined 
SEQ ID NO: 322 is the determined 
SEQ ED NO: 323 is the determined 
SEQ ID NO: 324 is the determined 
SEQ ID NO: 325 is the determined 
SEQ ID NO: 326 is the determined 
SEQ ID NO: 327 is the determmed 
SEQ ID NO: 328 is the determmed 
SEQ ID NO: 329 is the determmed 
SEQ ID NO: 330 is the determmed 
SEQ ID NO: 331 is the determmed 
SEQ ID NO: 332 is the determined 
SEQ ID NO: 333 is the determined 
SEQ ED NO: 334 is the determined 
SEQ ID NO: 335 is the determined 
SEQ ID NO: 336 is the determmed 



18 

5' cDNA sequence for Erdsn-12 
5' cDNA sequence for Erdsn-13 
5' cDNA sequence for Erdsn-14 
5' cDNA sequence for Erdsn-15 
5' cDNA sequence for Erdsn-16 
5' cDNA sequence for Erdsn-17 
5' cDNA sequence for Erdsn-18 
5' cDNA sequence for Erdsn-21 
5' cDNA sequence for Erdsn-22 
5' cDNA sequence for Erdsn-23 
5' cDNA sequence for Erdsn-25 
3* cDNA sequence for Erdsn-1 
3' cDNA sequence for Erdsn-2 
3 ' cDNA sequence for Erdsn-4 
3* cDNA sequence for Erdsn-5 
3' cDNA sequence for Erdsn-7 
3' cDNA sequence for Erdsn-8 
3' cDNA sequence for Erdsn-9 
3' cDNA sequence for Erdsn-10 
3* cDNA sequence for Erdsn-12 
3' cDNA sequence for Erdsn-13 
3' cDNA sequence for Erdsn-14 
3' cDNA sequence for Erdsn-15 
3' cDNA sequence for Erdsn-16 
3' cDNA sequence for Erdsn-17 
3' cDNA sequence for Erdsn-18 
3' cDNA sequence for Erdsn-21 
3' cDNA sequence for Erdsn-22 
3' cDNA sequence for Erdsn-23 
3' cDNA sequence for Erdsn-25 
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SEQ ID NO: 337 is the determined cDNA sequence for Erdsn-24 

SEQ ED NO: 338 is the determined amino acid sequence for a M tuberculosis 

85b precursor homolog 

SEQ ID NO: 339 is the determined amino acid sequence for ^t 1 
5 SEQ ED NO: 340 is a determined amino acid sequence for spot 2 

SEQ ED NO: 341 is a determined amino acid sequence for spot 2 

SEQ ID NO: 342 is the determined amino acid seq for spot 4 

SEQ ID NO: 343 is the sequence of primer PDM-I57 

SEQ ID NO: 344 is the sequence of primer PDM-160 
10 SEQ ED NO: 345 is the DNA sequence of the fusion protem TbF-6 

SEQ ED NO: 346 is the amino acid sequence of fusion protein TbF-6 

SEQ ID NO: 347 is the sequence of primer PDM-176 

SEQ ID NO: 348 is the sequence of primer PDM-175 

SEQ ID NO: 349 is the DNA sequence of the fusion protein TbF-8 
15 SEQ ED NO: 350 is the amino acid sequence of the fusion protein TbF-8 

DETAILED DESCRIPTION OF THE INVENTION 

As noted above, the present invention is generally directed to compositions and
methods for diagnosing tuberculosis. The compositions of the subject invention
include polypeptides that comprise at least one antigenic portion of a
M. tuberculosis antigen, or a variant of such an antigen that differs only in
conservative substitutions and/or modifications. Polypeptides within the scope of
the present invention include, but are not limited to, soluble M. tuberculosis
antigens. A "soluble M. tuberculosis antigen" is a protein of M. tuberculosis origin
that is present in M. tuberculosis culture filtrate. As used herein, the term
"polypeptide" encompasses amino acid chains of any length, including full length
proteins (i.e., antigens), wherein the amino acid residues are linked by covalent
peptide bonds. Thus, a polypeptide comprising an antigenic portion of one of the
above antigens may consist entirely of the antigenic portion, or may contain
additional sequences. The additional sequences may be derived from the native
M. tuberculosis antigen or may be heterologous, and such sequences may (but need
not) be antigenic.

An "antigenic portion" of an antigen (which may or may not be soluble) is a portion
that is capable of reacting with sera obtained from an M. tuberculosis-infected
individual (i.e., generates an absorbance reading with sera from infected
individuals that is at least three standard deviations above the absorbance obtained
with sera from uninfected individuals, in a representative ELISA assay described
herein). An "M. tuberculosis-infected individual" is a human who has been infected
with M. tuberculosis (e.g., has an intradermal skin test response to PPD that is at
least 0.5 cm in diameter). Infected individuals may display symptoms of tuberculosis
or may be free of disease symptoms. Polypeptides comprising at least an antigenic
portion of one or more M. tuberculosis antigens as described herein may generally be
used, alone or in combination, to detect tuberculosis in a patient.
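
By way of illustration only, the three-standard-deviation criterion for reactivity
may be sketched as follows. The function names, the use of the sample standard
deviation, and the control readings are assumptions made for this example and are
not taken from the representative ELISA described herein.

```python
from statistics import mean, stdev


def positivity_cutoff(uninfected_absorbances, n_sd=3.0):
    """Mean absorbance of uninfected control sera plus n_sd standard deviations.

    Assumption: the sample standard deviation is used; the text does not
    state which estimator applies.
    """
    return mean(uninfected_absorbances) + n_sd * stdev(uninfected_absorbances)


def is_reactive(absorbance, uninfected_absorbances, n_sd=3.0):
    """True if a reading is at least n_sd standard deviations above the control mean."""
    return absorbance >= positivity_cutoff(uninfected_absorbances, n_sd)


# Hypothetical optical density readings, for illustration only.
controls = [0.08, 0.10, 0.12, 0.09, 0.11]
cutoff = positivity_cutoff(controls)
```

A reading from an infected individual would then be scored with
is_reactive(reading, controls).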

The compositions and methods of the present invention also encompass variants of the
above polypeptides and DNA molecules. A polypeptide "variant," as used herein, is a
polypeptide that differs from the recited polypeptide only in conservative
substitutions and/or modifications, such that the therapeutic, antigenic and/or
immunogenic properties of the polypeptide are retained. Polypeptide variants
preferably exhibit at least about 70%, more preferably at least about 90% and most
preferably at least about 95% identity to the identified polypeptides. For
polypeptides with immunoreactive properties, variants may, alternatively, be
identified by modifying the amino acid sequence of one of the above polypeptides,
and evaluating the immunoreactivity of the modified polypeptide. For polypeptides
useful for the generation of diagnostic binding agents, a variant may be identified
by evaluating a modified polypeptide for the ability to generate antibodies that
detect the presence or absence of tuberculosis. Such modified sequences may be
prepared and tested using, for example, the representative procedures described
herein.
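
As a non-limiting sketch, percent identity between a recited polypeptide and a
candidate variant may be computed from a pairwise alignment as shown below. The
example assumes the two sequences have already been aligned to equal length with
"-" as the gap character; the alignment step itself, the function names and the
example sequences are hypothetical and are not specified in this description.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of aligned columns in which both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    # Ignore columns where both sequences have a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("no aligned positions to compare")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def identity_tier(pct):
    """Map a percent identity onto the polypeptide thresholds stated above (70/90/95)."""
    if pct >= 95:
        return "most preferred"
    if pct >= 90:
        return "more preferred"
    if pct >= 70:
        return "preferred"
    return "below the preferred range"


# Hypothetical ten-residue sequences differing at one position.
pct = percent_identity("ACDEFGHIKL", "ACDEFGHIKV")
```

Because the thresholds are recited as "at least about", the hard comparisons in
identity_tier are a simplification.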

As used herein, a "conservative substitution" is one in which an amino acid is
substituted for another amino acid that has similar properties, such that one
skilled in the art of peptide chemistry would expect the secondary structure and
hydropathic nature of the polypeptide to be substantially unchanged. In general, the
following groups of amino acids represent conservative changes: (1) ala, pro, gly,
glu, asp, gln, asn, ser, thr; (2) cys, ser, tyr, thr; (3) val, ile, leu, met, ala,
phe; (4) lys, arg, his; and (5) phe, tyr, trp, his.
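
The five groups above lend themselves to a simple lookup. The following sketch,
offered for illustration only, treats a substitution as conservative when both
residues fall within at least one common group; the function name and the decision
to treat an unchanged residue as "not a substitution" are assumptions of this
example.

```python
# Groups (1)-(5) as recited above, using three-letter residue codes.
CONSERVATIVE_GROUPS = [
    {"ala", "pro", "gly", "glu", "asp", "gln", "asn", "ser", "thr"},
    {"cys", "ser", "tyr", "thr"},
    {"val", "ile", "leu", "met", "ala", "phe"},
    {"lys", "arg", "his"},
    {"phe", "tyr", "trp", "his"},
]


def is_conservative(original, replacement):
    """True if the two residues differ and share at least one group."""
    a, b = original.lower(), replacement.lower()
    if a == b:
        return False  # identical residue: no substitution has occurred
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)
```

For example, is_conservative("Lys", "Arg") holds under group (4), whereas
is_conservative("Trp", "Gly") does not, since no group contains both residues.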

Variants may also, or alternatively, contain other modifications, including the
deletion or addition of amino acids that have minimal influence on the antigenic
properties, secondary structure and hydropathic nature of the polypeptide. For
example, a polypeptide may be conjugated to a signal (or leader) sequence at the
N-terminal end of the protein which co-translationally or post-translationally
directs transfer of the protein. The polypeptide may also be conjugated to a linker
or other sequence for ease of synthesis, purification or identification of the
polypeptide (e.g., poly-His), or to enhance binding of the polypeptide to a solid
support. For example, a polypeptide may be conjugated to an immunoglobulin Fc
region.

A nucleotide "variant" is a sequence that differs from the recited nucleotide
sequence in having one or more nucleotide deletions, substitutions or additions.
Such modifications may be readily introduced using standard mutagenesis techniques,
such as oligonucleotide-directed site-specific mutagenesis as taught, for example,
by Adelman et al. (DNA, 2:183, 1983). Nucleotide variants may be naturally occurring
allelic variants, or non-naturally occurring variants. Variant nucleotide sequences
preferably exhibit at least about 70%, more preferably at least about 80% and most
preferably at least about 90% identity to the recited sequence. Such variant
nucleotide sequences will generally hybridize to the recited nucleotide sequence
under stringent conditions. As used herein, "stringent conditions" refers to
prewashing in a solution of 6X SSC, 0.2% SDS; hybridizing at 65 °C, 6X SSC, 0.2% SDS
overnight; followed by two washes of 30 minutes each in 1X SSC, 0.1% SDS at 65 °C
and two washes of 30 minutes each in 0.2X SSC, 0.1% SDS at 65 °C.

In a related aspect, combination, or fusion, polypeptides are disclosed. A "fusion
polypeptide" is a polypeptide comprising at least one of the above antigenic
portions and one or more additional antigenic M. tuberculosis sequences, which are
joined via a peptide linkage into a single amino acid chain. The sequences may be
joined directly (i.e., with no intervening amino acids) or may be joined by way of a
linker sequence (e.g., Gly-Cys-Gly) that does not significantly diminish the
antigenic properties of the component polypeptides.

In general, M. tuberculosis antigens, and DNA sequences encoding such antigens, may
be prepared using any of a variety of procedures. For example, soluble antigens may
be isolated from M. tuberculosis culture filtrate by procedures known to those of
ordinary skill in the art, including anion-exchange and reverse phase
chromatography. Purified antigens may then be evaluated for a desired property, such
as the ability to react with sera obtained from an M. tuberculosis-infected
individual. Such screens may be performed using the representative methods described
herein. Antigens may then be partially sequenced using, for example, traditional
Edman chemistry. See Edman and Berg, Eur. J. Biochem. 80:116-132, 1967.

Antigens may also be produced recombinantly using a DNA sequence that encodes the
antigen, which has been inserted into an expression vector and expressed in an
appropriate host. DNA molecules encoding soluble antigens may be isolated by
screening an appropriate M. tuberculosis expression library with anti-sera (e.g.,
rabbit) raised specifically against soluble M. tuberculosis antigens. DNA sequences
encoding antigens that may or may not be soluble may be identified by screening an
appropriate M. tuberculosis genomic or cDNA expression library with sera obtained
from patients infected with M. tuberculosis. Such screens may generally be performed
using techniques well known in the art, such as those described in Sambrook et al.,
Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratories, Cold Spring
Harbor, NY, 1989.

DNA sequences encoding soluble antigens may also be obtained by screening an
appropriate M. tuberculosis cDNA or genomic DNA library for DNA sequences that
hybridize to degenerate oligonucleotides derived from partial amino acid sequences
of isolated soluble antigens. Degenerate oligonucleotide sequences for use in such a
screen may be designed and synthesized, and the screen may be performed, as
described (for example) in Sambrook et al., Molecular Cloning: A Laboratory Manual,
Cold Spring Harbor Laboratories, Cold Spring Harbor, NY (and references cited
therein). Polymerase chain reaction (PCR) may also be employed, using the above
oligonucleotides in methods well known in the art, to isolate a nucleic acid probe
from a cDNA or genomic library. The library screen may then be performed using the
isolated probe.
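
One consideration in designing degenerate oligonucleotides is the number of distinct
nucleotide sequences that can encode a given stretch of amino acids. The sketch
below, provided for illustration only, counts that degeneracy using the codon
multiplicities of the standard genetic code and locates the least degenerate window
in a partial amino acid sequence. The function names and the window length are
assumptions of this example; the peptide is the N-terminal sequence of SEQ ID NO: 118
written in one-letter code, and unknown residues (Xaa) are not handled.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code (one-letter codes).
CODON_COUNTS = {
    "M": 1, "W": 1,
    "C": 2, "D": 2, "E": 2, "F": 2, "H": 2, "K": 2, "N": 2, "Q": 2, "Y": 2,
    "I": 3,
    "A": 4, "G": 4, "P": 4, "T": 4, "V": 4,
    "L": 6, "R": 6, "S": 6,
}


def degeneracy(peptide):
    """Number of distinct coding sequences that can encode the peptide."""
    return prod(CODON_COUNTS[aa] for aa in peptide)


def least_degenerate_window(peptide, length):
    """Return (start index, sub-peptide, degeneracy) of the least degenerate window."""
    if length < 1 or length > len(peptide):
        raise ValueError("window length must be between 1 and the peptide length")
    windows = [(degeneracy(peptide[i:i + length]), i)
               for i in range(len(peptide) - length + 1)]
    fold, start = min(windows)  # ties resolve to the earliest window
    return start, peptide[start:start + length], fold


# Tyr-Tyr-Trp-Cys-Pro-Gly-Gln-Pro-Phe-Asp-Pro-Ala-Trp-Gly-Pro (SEQ ID NO: 118)
peptide_118 = "YYWCPGQPFDPAWGP"
best = least_degenerate_window(peptide_118, 4)
```

In practice, factors beyond degeneracy (such as oligonucleotide length and codon
usage of the organism) would also inform the design.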

Regardless of the method of preparation, the antigens described herein are
"antigenic." More specifically, the antigens have the ability to react with sera
obtained from an M. tuberculosis-infected individual. Reactivity may be evaluated
using, for example, the representative ELISA assays described herein, where an
absorbance reading with sera from infected individuals that is at least three
standard deviations above the absorbance obtained with sera from uninfected
individuals is considered positive.

Antigenic portions of M. tuberculosis antigens may be prepared and identified using
well known techniques, such as those summarized in Paul, Fundamental Immunology,
3d ed., Raven Press, 1993, pp. 243-247 and references cited therein. Such techniques
include screening polypeptide portions of the native antigen for antigenic
properties. The representative ELISAs described herein may generally be employed in
these screens. An antigenic portion of a polypeptide is a portion that, within such
representative assays, generates a signal that is substantially similar to that
generated by the full length antigen. In other words, an antigenic portion of a
M. tuberculosis antigen generates at least about 20%, and preferably about 100%, of
the signal induced by the full length antigen in a model ELISA as described herein.

Portions and other variants of M. tuberculosis antigens may be generated by
synthetic or recombinant means. Synthetic polypeptides having fewer than about 100
amino acids, and generally fewer than about 50 amino acids, may be generated using
techniques well known in the art. For example, such polypeptides may be synthesized
using any of the commercially available solid-phase techniques, such as the
Merrifield solid-phase synthesis method, where amino acids are sequentially added to
a growing amino acid chain. See Merrifield, J. Am. Chem. Soc. 85:2149-2154, 1963.
Equipment for automated synthesis of polypeptides is commercially available from
suppliers such as Applied BioSystems, Inc., Foster City, CA, and may be operated
according to the manufacturer's instructions. Variants of a native antigen may
generally be prepared using standard mutagenesis techniques, such as
oligonucleotide-directed site-specific mutagenesis. Sections of the DNA sequence may
also be removed using standard techniques to permit preparation of truncated
polypeptides.

Recombinant polypeptides containing portions and/or variants of a native antigen may
be readily prepared from a DNA sequence encoding the polypeptide using a variety of
techniques well known to those of ordinary skill in the art. For example,
supernatants from suitable host/vector systems which secrete recombinant protein
into culture media may be first concentrated using a commercially available filter.
Following concentration, the concentrate may be applied to a suitable purification
matrix such as an affinity matrix or an ion exchange resin. Finally, one or more
reverse phase HPLC steps can be employed to further purify a recombinant protein.

Any of a variety of expression vectors known to those of ordinary skill in the art
may be employed to express recombinant polypeptides as described herein. Expression
may be achieved in any appropriate host cell that has been transformed or
transfected with an expression vector containing a DNA molecule that encodes a
recombinant polypeptide. Suitable host cells include prokaryotes, yeast and higher
eukaryotic cells. Preferably, the host cells employed are E. coli, yeast or a
mammalian cell line, such as COS or CHO. The DNA sequences expressed in this manner
may encode naturally occurring antigens, portions of naturally occurring antigens,
or other variants thereof.

In general, regardless of the method of preparation, the polypeptides disclosed
herein are prepared in substantially pure form. Preferably, the polypeptides are at
least about 80% pure, more preferably at least about 90% pure and most preferably at
least about 99% pure. For use in the methods described herein, however, such
substantially pure polypeptides may be combined.

In certain specific embodiments, the subject invention discloses polypeptides
comprising at least an antigenic portion of a soluble M. tuberculosis antigen (or a
variant of such an antigen), where the antigen has one of the following N-terminal
sequences:

(a) Asp-Pro-Val-Asp-Ala-Val-Ile-Asn-Thr-Thr-Cys-Asn-Tyr-Gly-Gln-Val-Val-Ala-Ala-Leu (SEQ ID NO: 115);
(b) Ala-Val-Glu-Ser-Gly-Met-Leu-Ala-Leu-Gly-Thr-Pro-Ala-Pro-Ser (SEQ ID NO: 116);
(c) Ala-Ala-Met-Lys-Pro-Arg-Thr-Gly-Asp-Gly-Pro-Leu-Glu-Ala-Ala-Lys-Glu-Gly-Arg (SEQ ID NO: 117);
(d) Tyr-Tyr-Trp-Cys-Pro-Gly-Gln-Pro-Phe-Asp-Pro-Ala-Trp-Gly-Pro (SEQ ID NO: 118);
(e) Asp-Ile-Gly-Ser-Glu-Ser-Thr-Glu-Asp-Gln-Gln-Xaa-Ala-Val (SEQ ID NO: 119);
(f) Ala-Glu-Glu-Ser-Ile-Ser-Thr-Xaa-Glu-Xaa-Ile-Val-Pro (SEQ ID NO: 120);
(g) Asp-Pro-Glu-Pro-Ala-Pro-Pro-Val-Pro-Thr-Thr-Ala-Ala-Ser-Pro-Pro-Ser (SEQ ID NO: 121);
(h) Ala-Pro-Lys-Thr-Tyr-Xaa-Glu-Glu-Leu-Lys-Gly-Thr-Asp-Thr-Gly (SEQ ID NO: 122);
(i) Asp-Pro-Ala-Ser-Ala-Pro-Asp-Val-Pro-Thr-Ala-Ala-Gln-Gln-Thr-Ser-Leu-Leu-Asn-Ser-Leu-Ala-Asp-Pro-Asn-Val-Ser-Phe-Ala-Asn (SEQ ID NO: 123);
(j) Xaa-Asp-Ser-Glu-Lys-Ser-Ala-Thr-Ile-Lys-Val-Thr-Asp-Ala-Ser (SEQ ID NO: 129);
(k) Ala-Gly-Asp-Thr-Xaa-Ile-Tyr-Ile-Val-Gly-Asn-Leu-Thr-Ala-Asp (SEQ ID NO: 130); or
(l) Ala-Pro-Glu-Ser-Gly-Ala-Gly-Leu-Gly-Gly-Thr-Val-Gln-Ala-Gly (SEQ ID NO: 131),

wherein Xaa may be any amino acid, preferably a cysteine residue. A DNA sequence
encoding the antigen identified as (g) above is provided in SEQ ID NO: 52, the
deduced amino acid sequence of which is provided in SEQ ID NO: 53. A DNA sequence
encoding the antigen identified as (a) above is provided in SEQ ID NO: 96; its
deduced amino acid sequence is provided in SEQ ID NO: 97. A DNA sequence
corresponding to antigen (d) above is provided in SEQ ID NO: 24, a DNA sequence
corresponding to antigen (c) is provided in SEQ ID NO: 25 and a DNA sequence
corresponding to antigen (i) is disclosed in SEQ ID NO: 94 and its deduced amino
acid sequence is provided in SEQ ID NO: 95.

In a further specific embodiment, the subject invention discloses polypeptides
comprising at least an immunogenic portion of an M. tuberculosis antigen having one
of the following N-terminal sequences, or a variant thereof that differs only in
conservative substitutions and/or modifications:

(m) Xaa-Tyr-Ile-Ala-Tyr-Xaa-Thr-Thr-Ala-Gly-Ile-Val-Pro-Gly-Lys-Ile-Asn-Val-His-Leu-Val (SEQ ID NO: 132); or
(n) Asp-Pro-Pro-Asp-Pro-His-Gln-Xaa-Asp-Met-Thr-Lys-Gly-Tyr-Tyr-Pro-Gly-Gly-Arg-Arg-Xaa-Phe (SEQ ID NO: 124),

wherein Xaa may be any amino acid, preferably a cysteine residue. A DNA sequence
encoding the antigen of (n) above is provided in SEQ ID NO: 235, with the
corresponding predicted full-length amino acid sequence being provided in SEQ ID
NO: 236.

In other specific embodiments, the subject invention discloses polypeptides
comprising at least an antigenic portion of a soluble M. tuberculosis antigen (or a
variant of such an antigen) that comprises one or more of the amino acid sequences
encoded by (a) the DNA sequences of SEQ ID NOS: 1, 2, 4-10, 13-25, 52, 94 and 96,
(b) the complements of such DNA sequences, or (c) DNA sequences substantially
homologous to a sequence in (a) or (b).

In further specific embodiments, the subject invention discloses polypeptides
comprising at least an antigenic portion of a M. tuberculosis antigen (or a variant
of such an antigen), which may or may not be soluble, that comprises one or more of
the amino acid sequences encoded by (a) the DNA sequences of SEQ ID NOS: 26-51, 133,
134, 158-178, 184-188, 194-196, 198, 210-220, 232, 234, 235, 237-242, 248-251,
256-271, 287, 288, 290-293 and 298-337, (b) the complements of such DNA sequences or
(c) DNA sequences substantially homologous to a sequence in (a) or (b).

In a related aspect, the present invention provides fusion proteins comprising a
first and a second inventive polypeptide or, alternatively, a polypeptide of the
present invention and a known M. tuberculosis antigen, such as the 38 kD antigen
described in Andersen and Hansen, Infect. Immun. 57:2481-2488, 1989 (Genbank
Accession No. M30046) or ESAT-6 (SEQ ID NOS: 98 and 99), together with variants of
such fusion proteins. The fusion proteins of the present invention may also include
a linker peptide between the first and second polypeptides.

A DNA sequence encoding a fusion protein of the present invention is constructed
using known recombinant DNA techniques to assemble separate DNA sequences encoding
the first and second polypeptides into an appropriate expression vector. The 3' end
of a DNA sequence encoding the first polypeptide is ligated, with or without a
peptide linker, to the 5' end of a DNA sequence encoding the second polypeptide so
that the reading frames of the sequences are in phase to permit mRNA translation of
the two DNA sequences into a single fusion protein that retains the biological
activity of both the first and the second polypeptides.
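
The in-phase requirement can be checked computationally before a construct is
assembled. The following is a minimal sketch, under the assumption that each input
is a coding sequence beginning at the first base of a codon; the function names and
the short example sequences are hypothetical, and the linker shown encodes the
Gly-Cys-Gly linker mentioned earlier in this description.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(seq):
    """Split a sequence into consecutive three-base codons."""
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def fuse_in_frame(first_cds, second_cds, linker_cds=""):
    """Join two coding sequences (optionally via a linker) in one reading frame."""
    first, second, linker = (s.upper() for s in (first_cds, second_cds, linker_cds))
    for name, seq in (("first", first), ("second", second), ("linker", linker)):
        if len(seq) % 3 != 0:
            raise ValueError(f"{name} sequence length is not a multiple of 3")
    # Drop a terminal stop codon so that translation continues into the fusion partner.
    if first[-3:] in STOP_CODONS:
        first = first[:-3]
    upstream = first + linker
    if any(c in STOP_CODONS for c in codons(upstream)):
        raise ValueError("in-frame stop codon upstream of the second polypeptide")
    return upstream + second


# Hypothetical toy sequences; GGT-TGT-GGT encodes Gly-Cys-Gly.
fusion = fuse_in_frame("ATGGCCTAA", "ATGAAATGA", "GGTTGTGGT")
```

The check addresses only the reading frame; whether the fusion protein retains the
activity of both polypeptides must still be determined experimentally.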

A peptide linker sequence may be employed to separate the first and the second
polypeptides by a distance sufficient to ensure that each polypeptide folds into its
secondary and tertiary structures. Such a peptide linker sequence is incorporated
into the fusion protein using standard techniques well known in the art. Suitable
peptide linker sequences may be chosen based on the following factors: (1) their
ability to adopt a flexible extended conformation; (2) their inability to adopt a
secondary structure that could interact with functional epitopes on the first and
second polypeptides; and (3) the lack of hydrophobic or charged residues that might
react with the polypeptide functional epitopes. Preferred peptide linker sequences
contain Gly, Asn and Ser residues. Other near neutral amino acids, such as Thr and
Ala, may also be used in the linker sequence. Amino acid sequences which may be
usefully employed as linkers include those disclosed in Maratea et al., Gene
40:39-46, 1985; Murphy et al., Proc. Natl. Acad. Sci. USA 83:8258-8262, 1986; U.S.
Patent No. 4,935,233 and U.S. Patent No. 4,751,180. The linker sequence may be from
1 to about 50 amino acids in length. Peptide linker sequences are not required when
the first and second polypeptides have non-essential N-terminal amino acid regions
that can be used to separate the functional domains and prevent steric hindrance.

In another aspect, the present invention provides methods for using the polypeptides
described above to diagnose tuberculosis. In this aspect, methods are provided for
detecting M. tuberculosis infection in a biological sample, using one or more of the
above polypeptides, alone or in combination. In embodiments in which multiple
polypeptides are employed, polypeptides other than those specifically described
herein, such as the 38 kD antigen described in Andersen and Hansen, Infect. Immun.
57:2481-2488, 1989, may be included. As used herein, a "biological sample" is any
antibody-containing sample obtained from a patient. Preferably, the sample is whole
blood, sputum, serum, plasma, saliva, cerebrospinal fluid or urine. More preferably,
the sample is a blood, serum or plasma sample obtained from a patient or a blood
supply. The polypeptide(s) are used in an assay, as described below, to determine
the presence or absence of antibodies to the polypeptide(s) in the sample, relative
to a predetermined cut-off value. The presence of such antibodies indicates previous
sensitization to mycobacterial antigens which may be indicative of tuberculosis.

In embodiments in which more than one polypeptide is employed, the polypeptides used
are preferably complementary (i.e., one component polypeptide will tend to detect
infection in samples where the infection would not be detected by another component
polypeptide). Complementary polypeptides may generally be identified by using each
polypeptide individually to evaluate serum samples obtained from a series of
patients known to be infected with M. tuberculosis. After determining which samples
test positive (as described below) with each polypeptide, combinations of two or
more polypeptides may be formulated that are capable of detecting infection in most,
or all, of the samples tested. Such polypeptides are complementary. For example,
approximately 25-30% of sera from tuberculosis-infected individuals are negative for
antibodies to any single protein, such as the 38 kD antigen mentioned above.
Complementary polypeptides may, therefore, be used in combination with the 38 kD
antigen to improve sensitivity of a diagnostic test.
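
The selection of complementary polypeptides described above may be sketched as a
coverage problem. The example below is illustrative only: it uses a greedy
heuristic (an assumption of this sketch, not a procedure recited herein) that
repeatedly adds the polypeptide detecting the most samples not yet detected. The
antigen names other than the 38 kD antigen, the sample identifiers and the
reactivity data are hypothetical.

```python
def select_complementary_panel(reactivity, max_size=None):
    """Greedily choose polypeptides that together detect the most samples.

    reactivity maps each polypeptide name to the set of known-infected
    sample IDs that tested positive with that polypeptide alone.
    Returns (panel, samples still undetected by the panel).
    """
    remaining = set().union(*reactivity.values())
    panel = []
    while remaining and (max_size is None or len(panel) < max_size):
        # Pick the candidate that adds the most not-yet-detected samples;
        # sorting the names makes tie-breaking deterministic.
        best = max(sorted(reactivity), key=lambda p: len(reactivity[p] & remaining))
        gained = reactivity[best] & remaining
        if not gained:
            break
        panel.append(best)
        remaining -= gained
    return panel, remaining


# Hypothetical results for ten sera from patients known to be infected.
reactivity = {
    "38kD": {1, 2, 3, 4, 5, 6, 7},
    "antigen_A": {6, 7, 8, 9},
    "antigen_B": {1, 2, 10},
}
panel, missed = select_complementary_panel(reactivity)
```

A greedy choice does not guarantee the smallest possible panel, but it reflects the
stated goal of detecting infection in most, or all, of the samples tested.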

There are a variety of assay formats known to those of ordinary skill in the art for
using one or more polypeptides to detect antibodies in a sample. See, e.g., Harlow
and Lane, Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory, 1988,
which is incorporated herein by reference. In a preferred embodiment, the assay
involves the use of polypeptide immobilized on a solid support to bind to and remove
the antibody from the sample. The bound antibody may then be detected using a
detection reagent that contains a reporter group. Suitable detection reagents
include antibodies that bind to the antibody/polypeptide complex and free
polypeptide labeled with a reporter group (e.g., in a semi-competitive assay).
Alternatively, a competitive assay may be utilized, in which an antibody that binds
to the polypeptide is labeled with a reporter group and allowed to bind to the
immobilized antigen after incubation of the antigen with the sample. The extent to
which components of the sample inhibit the binding of the labeled antibody to the
polypeptide is indicative of the reactivity of the sample with the immobilized
polypeptide.

The solid support may be any solid material known to those of ordinary skill in the
art to which the antigen may be attached. For example, the solid support may be a
test well in a microtiter plate or a nitrocellulose or other suitable membrane.
Alternatively, the support may be a bead or disc, such as glass, fiberglass, latex
or a plastic material such as polystyrene or polyvinylchloride. The support may also
be a magnetic particle or a fiber optic sensor, such as those disclosed, for
example, in U.S. Patent No. 5,359,681.

The polypeptides may be bound to the solid support using a variety of techniques
known to those of ordinary skill in the art, which are amply described in the patent
and scientific literature. In the context of the present invention, the term "bound"
refers to both noncovalent association, such as adsorption, and covalent attachment
(which may be a direct linkage between the antigen and functional groups on the
support or may be a linkage by way of a cross-linking agent). Binding by adsorption
to a well in a microtiter plate or to a membrane is preferred. In such cases,
adsorption may be achieved by contacting the polypeptide, in a suitable buffer, with
the solid support for a suitable amount of time. The contact time varies with
temperature, but is typically between about 1 hour and 1 day. In general, contacting
a well of a plastic microtiter plate (such as polystyrene or polyvinylchloride) with
an amount of polypeptide ranging from about 10 ng to about 1 µg, and preferably
about 100 ng, is sufficient to bind an adequate amount of antigen.

Covalent attachment of polypeptide to a solid support may generally be achieved by
first reacting the support with a bifunctional reagent that will react with both the
support and a functional group, such as a hydroxyl or amino group, on the
polypeptide. For example, the polypeptide may be bound to supports having an
appropriate polymer coating using benzoquinone or by condensation of an aldehyde
group on the support with an amine and an active hydrogen on the polypeptide (see,
e.g., Pierce Immunotechnology Catalog and Handbook, 1991, at A12-A13).

In certain embodiments, the assay is an enzyme linked immunosorbent assay (ELISA).
This assay may be performed by first contacting a polypeptide antigen that has been
immobilized on a solid support, commonly the well of a microtiter plate, with the
sample, such that antibodies to the polypeptide within the sample are allowed to
bind to the immobilized polypeptide. Unbound sample is then removed from the
immobilized polypeptide and a detection reagent capable of binding to the
immobilized antibody-polypeptide complex is added. The amount of detection reagent
that remains bound to the solid support is then determined using a method
appropriate for the specific detection reagent.

More specifically, once the polypeptide is immobilized on the support as described
above, the remaining protein binding sites on the support are typically blocked. Any
suitable blocking agent known to those of ordinary skill in the art, such as bovine
serum albumin or Tween 20™ (Sigma Chemical Co., St. Louis, MO) may be employed. The
immobilized polypeptide is then incubated with the sample, and antibody is allowed
to bind to the antigen. The sample may be diluted with a suitable diluent, such as
phosphate-buffered saline (PBS) prior to incubation. In general, an appropriate
contact time (i.e., incubation time) is that period of time that is sufficient to
detect the presence of antibody within a M. tuberculosis-infected sample.
Preferably, the contact time is sufficient to achieve a level of binding that is at
least 95% of that achieved at equilibrium between bound and unbound antibody. Those
of ordinary skill in the art will recognize that the time necessary to achieve
equilibrium may be readily determined by assaying the level of binding that occurs
over a period of time. At room temperature, an incubation time of about 30 minutes
is generally sufficient.
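
For illustration only, a contact time meeting the 95% criterion may be read from a
binding time course as sketched below. The sketch assumes that the final reading of
the time course represents equilibrium binding; the function name and the readings
are hypothetical.

```python
def time_to_fraction_of_equilibrium(times, signals, fraction=0.95):
    """First measured time at which binding reaches the given fraction of equilibrium.

    Assumption: the last reading in the time course is taken as the
    equilibrium (plateau) level of binding.
    """
    if len(times) != len(signals) or not times:
        raise ValueError("times and signals must be non-empty and of equal length")
    plateau = signals[-1]
    for t, s in zip(times, signals):
        if s >= fraction * plateau:
            return t
    return times[-1]


# Hypothetical time course: minutes of incubation versus normalized signal.
minutes = [5, 10, 20, 30, 60, 120]
binding = [0.30, 0.55, 0.80, 0.96, 0.99, 1.00]
contact_time = time_to_fraction_of_equilibrium(minutes, binding)
```

With these illustrative readings, the criterion is first met at the 30 minute time
point.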

Unbound sample may then be removed by washing the solid support with an appropriate
buffer, such as PBS containing 0.1% Tween 20™. Detection reagent may then be added
to the solid support. An appropriate detection reagent is any compound that binds to
the immobilized antibody-polypeptide complex and that can be detected by any of a
variety of means known to those in the art. Preferably, the detection reagent
contains a binding agent (such as, for example, Protein A, Protein G,
immunoglobulin, lectin or free antigen) conjugated to a reporter group. Preferred
reporter groups include enzymes (such as horseradish peroxidase), substrates,
cofactors, inhibitors, dyes, radionuclides, luminescent groups, fluorescent groups,
biotin and colloidal particles, such as colloidal gold and selenium. The conjugation
of binding agent to reporter group may be achieved using standard methods known to
those of ordinary skill in the art. Common binding agents may also be purchased
conjugated to a variety of reporter groups from many commercial sources (e.g., Zymed
Laboratories, San Francisco, CA, and Pierce, Rockford, IL).

The detection reagent is then incubated with the immobilized
antibody-polypeptide complex for an amount of time sufficient to detect the bound antibody. An
appropriate amount of time may generally be determined from the manufacturer's
instructions or by assaying the level of binding that occurs over a period of time.
Unbound detection reagent is then removed and bound detection reagent is detected
using the reporter group. The method employed for detecting the reporter group
depends upon the nature of the reporter group. For radioactive groups, scintillation
counting or autoradiographic methods are generally appropriate. Spectroscopic
methods may be used to detect dyes, luminescent groups and fluorescent groups. Biotin
may be detected using avidin, coupled to a different reporter group (commonly a
radioactive or fluorescent group or an enzyme). Enzyme reporter groups may generally
be detected by the addition of substrate (generally for a specific period of time),
followed by spectroscopic or other analysis of the reaction products.

To determine the presence or absence of anti-M. tuberculosis antibodies
in the sample, the signal detected from the reporter group that remains bound to the
solid support is generally compared to a signal that corresponds to a predetermined
cut-off value. In one preferred embodiment, the cut-off value is the mean signal
obtained when the immobilized antigen is incubated with samples from an uninfected
patient. In general, a sample generating a signal that is three standard deviations above
the predetermined cut-off value is considered positive for tuberculosis. In an alternate
preferred embodiment, the cut-off value is determined using a Receiver Operator Curve,
according to the method of Sackett et al., Clinical Epidemiology: A Basic Science for
Clinical Medicine, Little Brown and Co., 1985, pp. 106-107. Briefly, in this
embodiment, the cut-off value may be determined from a plot of pairs of true positive
rates (i.e., sensitivity) and false positive rates (100%-specificity) that correspond to each
possible cut-off value for the diagnostic test result. The cut-off value on the plot that is
the closest to the upper left-hand corner (i.e., the value that encloses the largest area) is
the most accurate cut-off value, and a sample generating a signal that is higher than the
cut-off value determined by this method may be considered positive. Alternatively, the
cut-off value may be shifted to the left along the plot, to minimize the false positive
rate, or to the right, to minimize the false negative rate. In general, a sample generating
a signal that is higher than the cut-off value determined by this method is considered
positive for tuberculosis.
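
The two cut-off strategies described above can be sketched as follows. This is an illustrative sketch only: the function names and the sample signals are hypothetical, the standard deviation is assumed to be the sample standard deviation of the uninfected signals, and "closest to the upper left-hand corner" is taken to mean the smallest Euclidean distance to the point where the false positive rate is 0 and the sensitivity is 1.

```python
from statistics import mean, stdev


def mean_plus_sd_threshold(uninfected_signals, n_sd=3.0):
    """Mean signal of uninfected samples plus n_sd sample standard deviations."""
    return mean(uninfected_signals) + n_sd * stdev(uninfected_signals)


def roc_points(infected, uninfected):
    """Return (cutoff, sensitivity, false_positive_rate) for every observed signal
    used as a candidate cutoff. A sample is called positive when its signal is
    strictly higher than the cutoff."""
    points = []
    for cutoff in sorted(set(infected) | set(uninfected)):
        sensitivity = sum(s > cutoff for s in infected) / len(infected)
        false_pos = sum(s > cutoff for s in uninfected) / len(uninfected)
        points.append((cutoff, sensitivity, false_pos))
    return points


def closest_to_upper_left(points):
    """Pick the cutoff whose ROC point lies nearest to (FPR = 0, sensitivity = 1)."""
    best = min(points, key=lambda p: p[2] ** 2 + (1.0 - p[1]) ** 2)
    return best[0]


# Hypothetical ELISA signals (arbitrary OD units)
infected = [0.8, 0.9, 1.1, 1.3]
uninfected = [0.1, 0.2, 0.3, 0.5]
roc_cutoff = closest_to_upper_left(roc_points(infected, uninfected))
```

Shifting the chosen cutoff upward trades sensitivity for a lower false positive rate, which corresponds to moving left along the plot as described above.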

In a related embodiment, the assay is performed in a rapid flow-through
or strip test format, wherein the antigen is immobilized on a membrane, such as
nitrocellulose. In the flow-through test, antibodies within the sample bind to the
immobilized polypeptide as the sample passes through the membrane. A detection
reagent (e.g., protein A-colloidal gold) then binds to the antibody-polypeptide complex
as the solution containing the detection reagent flows through the membrane. The
detection of bound detection reagent may then be performed as described above. In the
strip test format, one end of the membrane to which polypeptide is bound is immersed
in a solution containing the sample. The sample migrates along the membrane through
a region containing detection reagent and to the area of immobilized polypeptide.
Concentration of detection reagent at the polypeptide indicates the presence of
anti-M. tuberculosis antibodies in the sample. Typically, the concentration of detection
reagent at that site generates a pattern, such as a line, that can be read visually. The
absence of such a pattern indicates a negative result. In general, the amount of
polypeptide immobilized on the membrane is selected to generate a visually discernible
pattern when the biological sample contains a level of antibodies that would be
sufficient to generate a positive signal in an ELISA, as discussed above. Preferably, the
amount of polypeptide immobilized on the membrane ranges from about 25 ng to about
1 µg, and more preferably from about 50 ng to about 500 ng. Such tests can typically
be performed with a very small amount (e.g., one drop) of patient serum or blood.

Of course, numerous other assay protocols exist that are suitable for use
with the polypeptides of the present invention. The above descriptions are intended to
be exemplary only.

In yet another aspect, the present invention provides antibodies to the
inventive polypeptides. Antibodies may be prepared by any of a variety of techniques
known to those of ordinary skill in the art. See, e.g., Harlow and Lane, Antibodies: A
Laboratory Manual, Cold Spring Harbor Laboratory, 1988. In one such technique, an
immunogen comprising the antigenic polypeptide is initially injected into any of a wide
variety of mammals (e.g., mice, rats, rabbits, sheep and goats). In this step, the
polypeptides of this invention may serve as the immunogen without modification.
Alternatively, particularly for relatively short polypeptides, a superior immune response
may be elicited if the polypeptide is joined to a carrier protein, such as bovine serum
albumin or keyhole limpet hemocyanin. The immunogen is injected into the animal
host, preferably according to a predetermined schedule incorporating one or more
booster immunizations, and the animals are bled periodically. Polyclonal antibodies
specific for the polypeptide may then be purified from such antisera by, for example,
affinity chromatography using the polypeptide coupled to a suitable solid support.

Monoclonal antibodies specific for the antigenic polypeptide of interest
may be prepared, for example, using the technique of Kohler and Milstein, Eur. J.
Immunol. 6:511-519, 1976, and improvements thereto. Briefly, these methods involve
the preparation of immortal cell lines capable of producing antibodies having the
desired specificity (i.e., reactivity with the polypeptide of interest). Such cell lines may
be produced, for example, from spleen cells obtained from an animal immunized as
described above. The spleen cells are then immortalized by, for example, fusion with a
myeloma cell fusion partner, preferably one that is syngeneic with the immunized
animal. A variety of fusion techniques may be employed. For example, the spleen cells
and myeloma cells may be combined with a nonionic detergent for a few minutes and
then plated at low density on a selective medium that supports the growth of hybrid
cells, but not myeloma cells. A preferred selection technique uses HAT (hypoxanthine,
aminopterin, thymidine) selection. After a sufficient time, usually about 1 to 2 weeks,
colonies of hybrids are observed. Single colonies are selected and tested for binding
activity against the polypeptide. Hybridomas having high reactivity and specificity are
preferred.

Monoclonal antibodies may be isolated from the supernatants of growing
hybridoma colonies. In addition, various techniques may be employed to enhance the
yield, such as injection of the hybridoma cell line into the peritoneal cavity of a suitable
vertebrate host, such as a mouse. Monoclonal antibodies may then be harvested from
the ascites fluid or the blood. Contaminants may be removed from the antibodies by
conventional techniques, such as chromatography, gel filtration, precipitation, and
extraction. The polypeptides of this invention may be used in the purification process
in, for example, an affinity chromatography step.

Antibodies may be used in diagnostic tests to detect the presence of
M. tuberculosis antigens using assays similar to those detailed above and other
techniques well known to those of skill in the art, thereby providing a method for
detecting M. tuberculosis infection in a patient.

Diagnostic reagents of the present invention may also comprise DNA
sequences encoding one or more of the above polypeptides, or one or more portions
thereof. For example, at least two oligonucleotide primers may be employed in a
polymerase chain reaction (PCR) based assay to amplify M. tuberculosis-specific
cDNA derived from a biological sample, wherein at least one of the oligonucleotide
primers is specific for a DNA molecule encoding a polypeptide of the present invention.
The presence of the amplified cDNA is then detected using techniques well known in
the art, such as gel electrophoresis. Similarly, oligonucleotide probes specific for a
DNA molecule encoding a polypeptide of the present invention may be used in a
hybridization assay to detect the presence of an inventive polypeptide in a biological
sample.

As used herein, the term "oligonucleotide primer/probe specific for a
DNA molecule" means an oligonucleotide sequence that has at least about 80%,
preferably at least about 90% and more preferably at least about 95%, identity to the
DNA molecule in question. Oligonucleotide primers and/or probes which may be
usefully employed in the inventive diagnostic methods preferably have at least about
10-40 nucleotides. In a preferred embodiment, the oligonucleotide primers comprise at
least about 10 contiguous nucleotides of a DNA molecule encoding one of the
polypeptides disclosed herein. Preferably, oligonucleotide probes for use in the
inventive diagnostic methods comprise at least about 15 contiguous nucleotides of
a DNA molecule encoding one of the polypeptides disclosed herein. Techniques for
both PCR based assays and hybridization assays are well known in the art (see, for
example, Mullis et al. Ibid; Ehrlich, Ibid). Primers or probes may thus be used to detect
M. tuberculosis-specific sequences in biological samples. DNA probes or primers
comprising oligonucleotide sequences described above may be used alone, in
combination with each other, or with previously identified sequences, such as the 38 kD
antigen discussed above.
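
The identity thresholds in the definition above can be illustrated with a short sketch. It assumes a simple ungapped comparison in which the oligonucleotide is slid along the target and the best-matching window is scored; the text does not specify an alignment method, so this is only one possible reading. The function names and the example sequences are hypothetical.

```python
def percent_identity(oligo, window):
    """Percentage of positions at which two equal-length sequences match."""
    if len(oligo) != len(window) or not oligo:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(oligo.upper(), window.upper()))
    return 100.0 * matches / len(oligo)


def best_identity(oligo, target):
    """Best ungapped percent identity of the oligo against any window of the target."""
    n = len(oligo)
    if n == 0 or n > len(target):
        raise ValueError("oligo must be non-empty and no longer than the target")
    return max(percent_identity(oligo, target[i:i + n])
               for i in range(len(target) - n + 1))


def is_specific(oligo, target, threshold=80.0):
    """True when the oligo meets the chosen identity threshold (80, 90 or 95)."""
    return best_identity(oligo, target) >= threshold


# Hypothetical 20-nt target and a 10-nt primer carrying two mismatches
target = "ATGGCCGATCCGGTTGATGC"
primer = "GATCCGGTAA"
```

Here the primer matches 8 of 10 positions in its best window, so it satisfies the 80% threshold but not the 90% one.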

The following Examples are offered by way of illustration and not by
way of limitation.

EXAMPLES

EXAMPLE 1

PURIFICATION AND CHARACTERIZATION OF POLYPEPTIDES
FROM M. TUBERCULOSIS CULTURE FILTRATE

This example illustrates the preparation of M. tuberculosis soluble
polypeptides from culture filtrate. Unless otherwise noted, all percentages in the
following example are weight per volume.

M. tuberculosis (either H37Ra, ATCC No. 25177, or H37Rv, ATCC
No. 25618) was cultured in sterile GAS media at 37°C for fourteen days. The media
was then vacuum filtered (leaving the bulk of the cells) through a 0.45 µ filter into a
sterile 2.5 L bottle. The media was then filtered through a 0.2 µ filter into a sterile 4 L
bottle. NaN3 was then added to the culture filtrate to a concentration of 0.04%. The
bottles were then placed in a 4°C cold room.

The culture filtrate was concentrated by placing the filtrate in a 12 L
reservoir that had been autoclaved and feeding the filtrate into a 400 ml Amicon stir cell
which had been rinsed with ethanol and contained a 10,000 kDa MWCO membrane.
The pressure was maintained at 60 psi using nitrogen gas. This procedure reduced the
12 L volume to approximately 50 ml.

The culture filtrate was then dialyzed into 0.1% ammonium bicarbonate
using a 8,000 kDa MWCO cellulose ester membrane, with two changes of ammonium
bicarbonate solution. Protein concentration was then determined by a commercially
available BCA assay (Pierce, Rockford, IL).

The dialyzed culture filtrate was then lyophilized, and the polypeptides
resuspended in distilled water. The polypeptides were then dialyzed against 0.01 mM
1,3 bis[tris(hydroxymethyl)-methylamino]propane, pH 7.5 (Bis-Tris propane buffer),
the initial conditions for anion exchange chromatography. Fractionation was performed
using gel profusion chromatography on a POROS 146 II Q/M anion exchange column
4.6 mm x 100 mm (Perseptive BioSystems, Framingham, MA) equilibrated in 0.01 mM
Bis-Tris propane buffer pH 7.5. Polypeptides were eluted with a linear 0-0.5 M NaCl
gradient in the above buffer system. The column eluent was monitored at a wavelength
of 220 nm.

The pools of polypeptides eluting from the ion exchange column were
dialyzed against distilled water and lyophilized. The resulting material was dissolved in
0.1% trifluoroacetic acid (TFA) pH 1.9 in water, and the polypeptides were purified on
a Delta-Pak C18 column (Waters, Milford, MA) 300 Angstrom pore size, 5 micron
particle size (3.9 x 150 mm). The polypeptides were eluted from the column with a
linear gradient from 0-60% dilution buffer (0.1% TFA in acetonitrile). The flow rate
was 0.75 ml/minute and the HPLC eluent was monitored at 214 nm. Fractions
containing the eluted polypeptides were collected to maximize the purity of the
individual samples. Approximately 200 purified polypeptides were obtained.

The purified polypeptides were then screened for the ability to induce
T-cell proliferation in PBMC preparations. The PBMCs from donors known to be PPD
skin test positive and whose T cells were shown to proliferate in response to PPD and
crude soluble proteins from MTB were cultured in medium comprising RPMI 1640
supplemented with 10% pooled human serum and 50 µg/ml gentamicin. Purified
polypeptides were added in duplicate at concentrations of 0.5 to 10 µg/mL. After six
days of culture in 96-well round-bottom plates in a volume of 200 µl, 50 µl of medium
was removed from each well for determination of IFN-γ levels, as described below.
The plates were then pulsed with 1 µCi/well of tritiated thymidine for a further 18
hours, harvested and tritium uptake determined using a gas scintillation counter.
Fractions that resulted in proliferation in both replicates three fold greater than the
proliferation observed in cells cultured in medium alone were considered positive.
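
The positivity criterion for the proliferation assay can be expressed as a short check. This sketch assumes that "three fold greater" means each replicate must strictly exceed three times the medium-alone counts; the function name and the counts shown are hypothetical.

```python
def proliferation_positive(replicate_cpm, medium_alone_cpm, fold=3.0):
    """True when every replicate exceeds `fold` times the medium-alone counts.

    Assumption: the comparison is strict (greater than, not equal to).
    """
    if not replicate_cpm:
        raise ValueError("at least one replicate is required")
    return all(cpm > fold * medium_alone_cpm for cpm in replicate_cpm)


# Hypothetical tritium uptake (counts per minute) for duplicate wells
fraction_a = proliferation_positive([3500, 4200], medium_alone_cpm=1000)
fraction_b = proliferation_positive([3500, 2500], medium_alone_cpm=1000)
```

Requiring both replicates to pass means a single high well is not enough to call a fraction positive, as in the second hypothetical fraction.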

IFN-γ was measured using an enzyme-linked immunosorbent assay
(ELISA). ELISA plates were coated with a mouse monoclonal antibody directed to
human IFN-γ (Chemicon) in PBS for four hours at room temperature. Wells were then
blocked with PBS containing 5% (W/V) non-fat dried milk for 1 hour at room
temperature. The plates were then washed six times in PBS/0.2% TWEEN-20 and
samples diluted 1:2 in culture medium in the ELISA plates were incubated overnight at
room temperature. The plates were again washed and a polyclonal rabbit anti-human
IFN-γ serum diluted 1:3000 in PBS/10% normal goat serum was added to each well.
The plates were then incubated for two hours at room temperature, washed and
horseradish peroxidase-coupled anti-rabbit IgG (Jackson Labs.) was added at a 1:2000
dilution in PBS/5% non-fat dried milk. After a further two hour incubation at room
temperature, the plates were washed and TMB substrate added. The reaction was
stopped after 20 min with 1 N sulfuric acid. Optical density was determined at 450 nm
using 570 nm as a reference wavelength. Fractions that resulted in both replicates
giving an OD two fold greater than the mean OD from cells cultured in medium alone,
plus 3 standard deviations, were considered positive.
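
The ELISA read-out rule can be sketched as follows. The wording of the criterion admits more than one reading; this sketch assumes the threshold is twice the mean medium-alone OD plus three sample standard deviations of the medium-alone wells, and that the 570 nm reference reading is subtracted from the 450 nm reading. All names and values are hypothetical.

```python
from statistics import mean, stdev


def corrected_od(od_450, od_570):
    """Subtract the reference-wavelength reading from the 450 nm reading."""
    return od_450 - od_570


def ifn_gamma_positive(replicate_ods, medium_alone_ods, fold=2.0, n_sd=3.0):
    """True when every replicate OD exceeds fold * mean + n_sd * SD of the
    medium-alone wells (one interpretation of the stated criterion)."""
    threshold = fold * mean(medium_alone_ods) + n_sd * stdev(medium_alone_ods)
    return all(od > threshold for od in replicate_ods)


# Hypothetical corrected OD values
medium_alone = [0.10, 0.12, 0.08]
positive_call = ifn_gamma_positive([0.30, 0.35], medium_alone)
negative_call = ifn_gamma_positive([0.30, 0.20], medium_alone)
```

With these made-up wells the threshold is about 0.26 OD units, so a fraction with one replicate at 0.20 is not scored positive.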

For sequencing, the polypeptides were individually dried onto
Biobrene™ (Perkin Elmer/Applied BioSystems Division, Foster City, CA) treated glass
fiber filters. The filters with polypeptide were loaded onto a Perkin Elmer/Applied
BioSystems Division Procise 492 protein sequencer. The polypeptides were sequenced
from the amino terminus using traditional Edman chemistry. The amino acid
sequence was determined for each polypeptide by comparing the retention time of the
PTH amino acid derivative to the appropriate PTH derivative standards.

Using the procedure described above, antigens having the following
N-terminal sequences were isolated:

(a) Asp-Pro-Val-Asp-Ala-Val-Ile-Asn-Thr-Thr-Xaa-Asn-Tyr-Gly-
Gln-Val-Val-Ala-Ala-Leu (SEQ ID NO: 54);

(b) Ala-Val-Glu-Ser-Gly-Met-Leu-Ala-Leu-Gly-Thr-Pro-Ala-Pro-
Ser (SEQ ID NO: 55);

(c) Ala-Ala-Met-Lys-Pro-Arg-Thr-Gly-Asp-Gly-Pro-Leu-Glu-Ala-
Ala-Lys-Glu-Gly-Arg (SEQ ID NO: 56);

(d) Tyr-Tyr-Trp-Cys-Pro-Gly-Gln-Pro-Phe-Asp-Pro-Ala-Trp-Gly-
Pro (SEQ ID NO: 57);

(e) Asp-Ile-Gly-Ser-Glu-Ser-Thr-Glu-Asp-Gln-Gln-Xaa-Ala-Val
(SEQ ID NO: 58);

(f) Ala-Glu-Glu-Ser-Ile-Ser-Thr-Xaa-Glu-Xaa-Ile-Val-Pro (SEQ ID
NO: 59);

(g) Asp-Pro-Glu-Pro-Ala-Pro-Pro-Val-Pro-Thr-Ala-Ala-Ala-Ala-
Pro-Pro-Ala (SEQ ID NO: 60); and

(h) Ala-Pro-Lys-Thr-Tyr-Xaa-Glu-Glu-Leu-Lys-Gly-Thr-Asp-Thr-
Gly (SEQ ID NO: 61);
wherein Xaa may be any amino acid.

An additional antigen was isolated employing a microbore HPLC
purification step in addition to the procedure described above. Specifically, 20 µl of a
fraction comprising a mixture of antigens from the chromatographic purification step
previously described was purified on an Aquapore C18 column (Perkin Elmer/Applied
Biosystems Division, Foster City, CA) with a 7 micron pore size, column size 1 mm x
100 mm, in a Perkin Elmer/Applied Biosystems Division Model 172 HPLC. Fractions
were eluted from the column with a linear gradient of 1%/minute of acetonitrile
(containing 0.05% TFA) in water (0.05% TFA) at a flow rate of 80 µl/minute. The
eluent was monitored at 250 nm. The original fraction was separated into 4 major peaks
plus other smaller components and a polypeptide was obtained which was shown to
have a molecular weight of 12.054 Kd (by mass spectrometry) and the following
N-terminal sequence:

(i) Asp-Pro-Ala-Ser-Ala-Pro-Asp-Val-Pro-Thr-Ala-Ala-Gln-Gln-
Thr-Ser-Leu-Leu-Asn-Asn-Leu-Ala-Asp-Pro-Asp-Val-Ser-Phe-
Ala-Asp (SEQ ID NO: 62).
This polypeptide was shown to induce proliferation and IFN-γ production in PBMC
preparations using the assays described above.

Additional soluble antigens were isolated from M. tuberculosis culture
filtrate as follows. M. tuberculosis culture filtrate was prepared as described above.
Following dialysis against Bis-Tris propane buffer, at pH 5.5, fractionation was
performed using anion exchange chromatography on a Poros QE column 4.6 x 100 mm
(Perseptive Biosystems) equilibrated in Bis-Tris propane buffer pH 5.5. Polypeptides
were eluted with a linear 0-1.5 M NaCl gradient in the above buffer system at a flow
rate of 10 ml/min. The column eluent was monitored at a wavelength of 214 nm.

The fractions eluting from the ion exchange column were pooled and
subjected to reverse phase chromatography using a Poros R2 column 4.6 x 100 mm
(Perseptive Biosystems). Polypeptides were eluted from the column with a linear
gradient from 0-100% acetonitrile (0.1% TFA) at a flow rate of 5 ml/min. The eluent
was monitored at 214 nm.

Fractions containing the eluted polypeptides were lyophilized and
resuspended in 80 µl of aqueous 0.1% TFA and further subjected to reverse phase
chromatography on a Vydac C4 column 4.6 x 150 mm (Western Analytical, Temecula,
CA) with a linear gradient of 0-100% acetonitrile (0.1% TFA) at a flow rate of 2
ml/min. Eluent was monitored at 214 nm.

The fraction with biological activity was separated into one major peak
plus other smaller components. Western blot of this peak onto PVDF membrane
revealed three major bands of molecular weights 14 Kd, 20 Kd and 26 Kd. These
polypeptides were determined to have the following N-terminal sequences, respectively:

(j) Xaa-Asp-Ser-Glu-Lys-Ser-Ala-Thr-Ile-Lys-Val-Thr-Asp-Ala-
Ser; (SEQ ID NO: 129)

(k) Ala-Gly-Asp-Thr-Xaa-Ile-Tyr-Ile-Val-Gly-Asn-Leu-Thr-Ala-
Asp; (SEQ ID NO: 130) and

(l) Ala-Pro-Glu-Ser-Gly-Ala-Gly-Leu-Gly-Gly-Thr-Val-Gln-Ala-
Gly; (SEQ ID NO: 131), wherein Xaa may be any amino acid.
Using the assays described above, these polypeptides were shown to induce
proliferation and IFN-γ production in PBMC preparations. Figs. 1A and B show the
results of such assays using PBMC preparations from a first and a second donor,
respectively.

DNA sequences that encode the antigens designated as (a), (c), (d) and
(g) above were obtained by screening a M. tuberculosis genomic library using 32P
end-labeled degenerate oligonucleotides corresponding to the N-terminal sequence and
containing M. tuberculosis codon bias. The screen performed using a probe
corresponding to antigen (a) above identified a clone having the sequence provided in
SEQ ID NO: 96. The polypeptide encoded by SEQ ID NO: 96 is provided in SEQ ID
NO: 97. The screen performed using a probe corresponding to antigen (g) above
identified a clone having the sequence provided in SEQ ID NO: 52. The polypeptide
encoded by SEQ ID NO: 52 is provided in SEQ ID NO: 53. The screen performed
using a probe corresponding to antigen (d) above identified a clone having the sequence
provided in SEQ ID NO: 24, and the screen performed with a probe corresponding to
antigen (c) identified a clone having the sequence provided in SEQ ID NO: 25.
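
How a degenerate oligonucleotide can be derived from an N-terminal peptide sequence is sketched below. This is an illustration only and not the probe design actually used: it back-translates with IUPAC ambiguity codes from the standard genetic code and does not apply any M. tuberculosis codon bias. The single-codon patterns chosen for Arg, Leu and Ser are deliberately over-inclusive simplifications, and all names are hypothetical.

```python
# One IUPAC-coded codon pattern per residue (standard genetic code).
# Arg (MGN), Leu (YTN) and Ser (WSN) are over-inclusive single patterns.
DEGENERATE_CODON = {
    "Ala": "GCN", "Arg": "MGN", "Asn": "AAY", "Asp": "GAY", "Cys": "TGY",
    "Gln": "CAR", "Glu": "GAR", "Gly": "GGN", "His": "CAY", "Ile": "ATH",
    "Leu": "YTN", "Lys": "AAR", "Met": "ATG", "Phe": "TTY", "Pro": "CCN",
    "Ser": "WSN", "Thr": "ACN", "Trp": "TGG", "Tyr": "TAY", "Val": "GTN",
    "Xaa": "NNN",  # unknown residue: any codon
}

# Number of bases each IUPAC symbol stands for
IUPAC_COUNT = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "M": 2, "K": 2, "S": 2, "W": 2,
    "H": 3, "B": 3, "V": 3, "D": 3, "N": 4,
}


def back_translate(peptide):
    """Convert a hyphen-separated three-letter peptide into a degenerate DNA string."""
    return "".join(DEGENERATE_CODON[residue] for residue in peptide.split("-"))


def degeneracy(oligo):
    """Number of distinct sequences represented by a degenerate oligonucleotide."""
    total = 1
    for base in oligo:
        total *= IUPAC_COUNT[base]
    return total


# First six residues of antigen (d) as listed above
probe = back_translate("Tyr-Tyr-Trp-Cys-Pro-Gly")
```

Applying a codon bias, as the text describes, would reduce the degeneracy further by restricting the wobble positions to the codons the organism favors.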

The above amino acid sequences were compared to known amino acid
sequences in the gene bank using the DNA STAR system. The database searched
contains some 173,000 proteins and is a combination of the Swiss, PIR databases along
with translated protein sequences (Version 87). No significant homologies to the amino
acid sequences for antigens (a)-(h) and (l) were detected.

The amino acid sequence for antigen (i) was found to be homologous to
a sequence from M. leprae. The full length M. leprae sequence was amplified from
genomic DNA using the sequence obtained from GENBANK. This sequence was then
used to screen an M. tuberculosis library and a full length copy of the M. tuberculosis
homologue was obtained (SEQ ID NO: 94).

The amino acid sequence for antigen (j) was found to be homologous to
a known M. tuberculosis protein translated from a DNA sequence. To the best of the
inventors' knowledge, this protein has not been previously shown to possess T-cell
stimulatory activity. The amino acid sequence for antigen (k) was found to be related to
a sequence from M. leprae.

In the proliferation and IFN-γ assays described above, using three PPD
positive donors, the results for representative antigens provided above are presented in
Table 1:

TABLE 1

Sequence    Proliferation    IFN-γ
(a)         +
(c)                          +++
(d)
(g)         +++
(h)                          +++

In Table 1, responses that gave a stimulation index (SI) of between 2 and
4 (compared to cells cultured in medium alone) were scored as +, an SI of 4-8 or 2-4 at a
concentration of 1 µg or less was scored as ++, and an SI of greater than 8 was scored as
+++. The antigen of sequence (i) was found to have a high SI (+++) for one donor and
lower SI (++ and +) for the two other donors in both proliferation and IFN-γ assays.
These results indicate that these antigens are capable of inducing proliferation and/or
interferon-γ production.
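
The scoring scheme for Table 1 can be written as a small function. This sketch rests on several assumptions: that the three grades are +, ++ and +++, that the concentration threshold is 1 µg, that boundary values of exactly 4 and 8 fall into the ++ grade, and that an SI below 2 is scored as negative. The function names are hypothetical.

```python
def stimulation_index(antigen_response, medium_alone_response):
    """Ratio of the response with antigen to the response in medium alone."""
    if medium_alone_response <= 0:
        raise ValueError("medium-alone response must be positive")
    return antigen_response / medium_alone_response


def score_si(si, concentration_ug=None):
    """Map a stimulation index to a grade under the assumptions stated above."""
    if si > 8:
        return "+++"
    if si >= 4:
        return "++"
    if si >= 2:
        # An SI of 2-4 is upgraded when seen at a low antigen concentration
        if concentration_ug is not None and concentration_ug <= 1:
            return "++"
        return "+"
    return "-"


# Hypothetical counts: 6000 cpm with antigen versus 1000 cpm in medium alone
grade = score_si(stimulation_index(6000, 1000))
```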



EXAMPLE 2

USE OF PATIENT SERA TO ISOLATE M. TUBERCULOSIS ANTIGENS

This example illustrates the isolation of antigens from M. tuberculosis
lysate by screening with serum from M. tuberculosis-infected individuals.

Desiccated M. tuberculosis H37Ra (Difco Laboratories) was added to a
2% NP40 solution, and alternately homogenized and sonicated three times. The
resulting suspension was centrifuged at 13,000 rpm in microfuge tubes and the
supernatant put through a 0.2 micron syringe filter. The filtrate was bound to Macro
Prep DEAE beads (BioRad, Hercules, CA). The beads were extensively washed with
20 mM Tris pH 7.5 and bound proteins eluted with 1 M NaCl. The NaCl eluate was
dialyzed overnight against 10 mM Tris, pH 7.5. Dialyzed solution was treated with
DNase and RNase at 0.05 mg/ml for 30 min. at room temperature and then with
α-D-mannosidase, 0.5 U/mg at pH 4.5 for 3-4 hours at room temperature. After returning to
pH 7.5, the material was fractionated via FPLC over a Bio Scale-Q-20 column
(BioRad). Fractions were combined into nine pools, concentrated in a Centriprep 10
(Amicon, Beverly, MA) and screened by Western blot for serological activity using a
serum pool from M. tuberculosis-infected patients which was not immunoreactive with
other antigens of the present invention.

The most reactive fraction was run in SDS-PAGE and transferred to 
PVDF. A band at approximately 85 Kd was cut out yielding the sequence: 

(m) Xaa-Tyr-Ile-Ala-Tyr-Xaa-Thr-Thr-Ala-Gly-Ile-Val-Pro-Gly-Lys-
Ile-Asn-Val-His-Leu-Val; (SEQ ID NO: 132), wherein Xaa may
be any amino acid.

Comparison of this sequence with those in the gene bank as described 
above, revealed no significant homologies to known sequences. 

A DNA sequence that encodes the antigen designated as (m) above was
obtained by screening a genomic M. tuberculosis Erdman strain library using labeled
degenerate oligonucleotides corresponding to the N-terminal sequence of SEQ ID
NO:137. A clone was identified having the DNA sequence provided in SEQ ID NO:
198. This sequence was found to encode the amino acid sequence provided in SEQ ID
NO: 199. Comparison of these sequences with those in the gene bank revealed some
similarity to sequences previously identified in M. tuberculosis and M. bovis.

EXAMPLE 3

This example illustrates the preparation of DNA sequences encoding
M. tuberculosis antigens by screening a M. tuberculosis expression library with sera
obtained from patients infected with M. tuberculosis, or with anti-sera raised against
M. tuberculosis antigens.

A.

Genomic DNA was isolated from the M. tuberculosis strain H37Ra. The
DNA was randomly sheared and used to construct an expression library using the
Lambda ZAP expression system (Stratagene, La Jolla, CA). Rabbit anti-sera was
generated against secretory proteins of the M. tuberculosis strains H37Ra, H37Rv and
Erdman by immunizing a rabbit with concentrated supernatant of the M. tuberculosis
cultures. Specifically, the rabbit was first immunized subcutaneously with 200 µg of
protein antigen in a total volume of 2 ml containing 100 µg muramyl dipeptide
(Calbiochem, La Jolla, CA) and 1 ml of incomplete Freund's adjuvant. Four weeks later
the rabbit was boosted subcutaneously with 100 µg antigen in incomplete Freund's
adjuvant. Finally, the rabbit was immunized intravenously four weeks later with 50 µg
protein antigen. The anti-sera were used to screen the expression library as described in
Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor
Laboratories, Cold Spring Harbor, NY, 1989. Bacteriophage plaques expressing
immunoreactive antigens were purified. Phagemid from the plaques was rescued and
the nucleotide sequences of the M. tuberculosis clones deduced.

Thirty-two clones were purified. Of these, 25 represent sequences that
have not been previously identified in M. tuberculosis. Proteins were induced by IPTG
and purified by gel elution, as described in Skeiky et al., J. Exp. Med. 181:1527-1537,
1995. Representative partial sequences of DNA molecules identified in this screen are
provided in SEQ ID NOS: 1-25. The corresponding predicted amino acid sequences are
shown in SEQ ID NOS: 64-88.

On comparison of these sequences with known sequences in the gene
bank using the databases described above, it was found that the clones referred to
hereinafter as TbRA2A, TbRA16, TbRA18, and TbRA29 (SEQ ID NOS: 77, 69, 71,
76) show some homology to sequences previously identified in Mycobacterium leprae
but not in M. tuberculosis. TbRA2A was found to be a lipoprotein, with a six residue
lipidation sequence being located adjacent to a hydrophobic secretory sequence.
TbRA11, TbRA26, TbRA28 and TbDPEP (SEQ ID NOS: 66, 74, 75, 53) have been
previously identified in M. tuberculosis. No significant homologies were found to
TbRA1, TbRA3, TbRA4, TbRA9, TbRA10, TbRA13, TbRA17, TbRA19, TbRA29,
TbRA32, TbRA36 and the overlapping clones TbRA35 and TbRA12 (SEQ ID
NOS: 64, 78, 82, 83, 65, 68, 76, 72, 76, 79, 81, 80, 67, respectively). The clone
TbRa24 is overlapping with clone TbRa29.

The genomic DNA library described above, and an additional H37Rv
library, were screened using pools of sera obtained from patients with active
tuberculosis. To prepare the H37Rv library, M. tuberculosis strain H37Rv genomic
DNA was isolated, subjected to partial Sau3A digestion and used to construct an
expression library using the Lambda Zap expression system (Stratagene, La Jolla, CA).
Three different pools of sera, each containing sera obtained from three individuals with
active pulmonary or pleural disease, were used in the expression screening. The pools
were designated TbL, TbM and TbH, referring to relative reactivity with H37Ra lysate
(i.e., TbL = low reactivity, TbM = medium reactivity and TbH = high reactivity) in both
ELISA and immunoblot format. A fourth pool of sera from seven patients with active
pulmonary tuberculosis was also employed. All of the sera lacked increased reactivity
with the recombinant 38 kD M. tuberculosis H37Ra phosphate-binding protein.

All pools were pre-adsorbed with E. coli lysate and used to screen the
H37Ra and H37Rv expression libraries, as described in Sambrook et al., Molecular
Cloning: A Laboratory Manual, Cold Spring Harbor Laboratories, Cold Spring Harbor,
NY, 1989. Bacteriophage plaques expressing immunoreactive antigens were purified.
Phagemid from the plaques was rescued and the nucleotide sequences of the
M. tuberculosis clones deduced.

Thirty-two clones were purified. Of these, 31 represented sequences that
had not been previously identified in human M. tuberculosis. Representative sequences
of the DNA molecules identified are provided in SEQ ID NOS: 26-51 and 100. Of
these, TbH-8-2 (SEQ. ID NO. 100) is a partial clone of TbH-8, and TbH-4 (SEQ ID
NO. 43) and TbH-4-FWD (SEQ. ID NO. 44) are non-contiguous sequences from the
same clone. Amino acid sequences for the antigens hereinafter identified as Tb38-1,
TbH-4, TbH-8, TbH-9, and TbH-12 are shown in SEQ ID NOS.: 89-93. Comparison
of these sequences with known sequences in the gene bank using the databases
identified above revealed no significant homologies to TbH-4, TbH-8, TbH-9 and
TbM-3, although weak homologies were found to TbH-9. TbH-12 was found to be
homologous to a 34 kD antigenic protein previously identified in M. paratuberculosis
(Acc. No. S28515). Tb38-1 was found to be located 34 base pairs upstream of the open
reading frame for the antigen ESAT-6 previously identified in M. bovis (Acc.
No. U34848) and in M. tuberculosis (Sorensen et al., Infec. Immun. 63:1710-1717,
1995).

Probes derived from Tb38-1 and TbH-9, both isolated from an H37Ra
library, were used to identify clones in an H37Rv library. Tb38-1 hybridized to
Tb38-1F2, Tb38-1F3, Tb38-1F5 and Tb38-1F6 (SEQ. ID NOS: 107, 108, 111, 113, and
114). (SEQ ID NOS: 107 and 108 are non-contiguous sequences from clone
Tb38-1F2.) Two open reading frames were deduced in Tb38-1F2; one corresponds to Tb37FL
(SEQ. ID. NO. 109), the second, a partial sequence, may be the homologue of Tb38-1
and is called Tb38-IN (SEQ. ID NO. 110). The deduced amino acid sequence of
Tb38-1F3 is presented in SEQ. ID. NO. 112. A TbH-9 probe identified three clones in the
H37Rv library: TbH-9-FL (SEQ. ID NO. 101), which may be the homologue of TbH-9
(H37Ra), TbH-9-1 (SEQ. ID NO. 103), and TbH-8-2 (SEQ. ID NO. 105), which is a partial
clone of TbH-8. The deduced amino acid sequences for these three clones are presented
in SEQ ID NOS: 102, 104 and 106.

Further screening of the M. tuberculosis genomic DNA library, as
described above, resulted in the recovery of ten additional reactive clones, representing
seven different genes. One of these genes was identified as the 38 Kd antigen discussed
above, one was determined to be identical to the 14 Kd alpha crystallin heat shock
protein previously shown to be present in M. tuberculosis, and a third was determined
to be identical to the antigen TbH-8 described above. The determined DNA sequences
for the remaining five clones (hereinafter referred to as TbH-29, TbH-30, TbH-32 and
TbH-33) are provided in SEQ ID NO: 133-136, respectively, with the corresponding
predicted amino acid sequences being provided in SEQ ID NO: 137-140, respectively.
The DNA and amino acid sequences for these antigens were compared with those in the
gene bank as described above. No homologies were found to the 5' end of TbH-29
(which contains the reactive open reading frame), although the 3' end of TbH-29 was
found to be identical to the M. tuberculosis cosmid Y227. TbH-32 and TbH-33 were
found to be identical to the previously identified M. tuberculosis insertion element
IS6110 and to the M. tuberculosis cosmid Y50, respectively. No significant homologies
to TbH-30 were found.

Positive phagemid from this additional screening were used to infect E.
coli XL-1 Blue MRF', as described in Sambrook et al., supra. Induction of recombinant
protein was accomplished by the addition of IPTG. Induced and uninduced lysates
were run in duplicate on SDS-PAGE and transferred to nitrocellulose filters. Filters
were reacted with human M. tuberculosis sera (1:200 dilution) reactive with TbH and a
rabbit sera (1:200 or 1:250 dilution) reactive with the N-terminal 4 Kd portion of lacZ.
Sera incubations were performed for 2 hours at room temperature. Bound antibody was
detected by addition of 125I-labeled Protein A and subsequent exposure to film for
variable times ranging from 16 hours to 11 days. The results of the immunoblots are
summarized in Table 2.

TABLE 2

Antigen     Human M. tb Sera     Anti-lacZ Sera
TbH-29      45 Kd                45 Kd
TbH-30      No reactivity        29 Kd
TbH-32      12 Kd                12 Kd
TbH-33      16 Kd                16 Kd


Positive reaction of the recombinant human M. tuberculosis antigens
with both the human M. tuberculosis sera and anti-lacZ sera indicates that reactivity of
the human M. tuberculosis sera is directed towards the fusion protein. Antigens
reactive with the anti-lacZ sera but not with the human M. tuberculosis sera may be the
result of the human M. tuberculosis sera recognizing conformational epitopes, or the
antigen-antibody binding kinetics may be such that the 2 hour sera exposure in the
immunoblot is not sufficient.

Studies were undertaken to determine whether the antigens TbH-9 and
Tb38-1 represent cellular proteins or are secreted into M. tuberculosis culture media. In
the first study, rabbit sera were raised against A) secretory proteins of M. tuberculosis,
B) the known secretory recombinant M. tuberculosis antigen 85b, C) recombinant
Tb38-1 and D) recombinant TbH-9, using protocols substantially as described in
Example 3A. Total M. tuberculosis lysate, concentrated supernatant of M. tuberculosis
cultures and the recombinant antigens 85b, TbH-9 and Tb38-1 were resolved on
denaturing gels, immobilized on nitrocellulose membranes and duplicate blots were
probed using the rabbit sera described above.

The results of this analysis using control sera (panel I) and antisera
(panel II) against secretory proteins, recombinant 85b, recombinant Tb38-1 and
recombinant TbH-9 are shown in Figures 2A-D, respectively, wherein the lane
designations are as follows: 1) molecular weight protein standards; 2) 5 µg of M.
tuberculosis lysate; 3) 5 µg secretory proteins; 4) 50 ng recombinant Tb38-1; 5) 50 ng
recombinant TbH-9; and 6) 50 ng recombinant 85b. The recombinant antigens were
engineered with six terminal histidine residues and would therefore be expected to
migrate with a mobility approximately 1 kD larger than the native protein. In Figure
2D, recombinant TbH-9 is lacking approximately 10 kD of the full-length 42 kD
antigen, hence the significant difference in the size of the immunoreactive native TbH-9
antigen in the lysate lane (indicated by an arrow). These results demonstrate that
Tb38-1 and TbH-9 are intracellular antigens and are not actively secreted by M. tuberculosis.
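
The expected shift of approximately 1 kD for a six-histidine tag can be checked with a short calculation. The following is an illustrative sketch only: the average histidine residue mass (about 137.14 Da) is a standard reference value assumed here rather than taken from this specification, and the function name is hypothetical.

```python
# Average mass of one histidine residue within a peptide chain, in daltons.
# Standard reference value; an assumption of this sketch, not from the text.
HIS_RESIDUE_MASS_DA = 137.14


def his_tag_mass_kd(n_his: int = 6) -> float:
    """Approximate mass added by n_his terminal histidine residues, in kD."""
    return n_his * HIS_RESIDUE_MASS_DA / 1000.0


if __name__ == "__main__":
    print(f"Six-histidine tag adds about {his_tag_mass_kd():.2f} kD")
```

Under this assumption the tag contributes roughly 0.8 kD, which is consistent with the stated shift of approximately 1 kD.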

The finding that TbH-9 is an intracellular antigen was confirmed by
determining the reactivity of TbH-9-specific human T cell clones to recombinant
TbH-9, secretory M. tuberculosis proteins and PPD. A TbH-9-specific T cell clone
(designated 131TbH-9) was generated from PBMC of a healthy PPD-positive donor.
The proliferative response of 131TbH-9 to secretory proteins, recombinant TbH-9 and a
control M. tuberculosis antigen, TbRa11, was determined by measuring uptake of
tritiated thymidine, as described in Example 1. As shown in Figure 3A, the clone
131TbH-9 responds specifically to TbH-9, showing that TbH-9 is not a significant
component of M. tuberculosis secretory proteins. Figure 3B shows the production of
IFN-γ by a second TbH-9-specific T cell clone (designated PPD 800-10) prepared from
PBMC from a healthy PPD-positive donor, following stimulation of the T cell clone
with secretory proteins, PPD or recombinant TbH-9. These results further confirm that
TbH-9 is not secreted by M. tuberculosis.



IDENTIFY DNA SEQUENCES ENCODING M. TUBERCULOSIS ANTIGENS

Genomic DNA was isolated from M. tuberculosis Erdman strain,
randomly sheared and used to construct an expression library employing the Lambda
ZAP expression system (Stratagene, La Jolla, CA). The resulting library was screened
using pools of sera obtained from individuals with extrapulmonary tuberculosis, as
described above in Example 3B, with the secondary antibody being goat anti-human
IgG - A - M (H+L) conjugated with alkaline phosphatase.

Eighteen clones were purified. Of these, 4 clones (hereinafter referred to
as XP14, XP24, XP31 and XP32) were found to bear some similarity to known
sequences. The determined DNA sequences for XP14, XP24 and XP31 are provided in
SEQ ID NOS: 151-153, respectively, with the 5' and 3' DNA sequences for XP32 being
provided in SEQ ID NOS: 154 and 155, respectively. The predicted amino acid
sequence for XP14 is provided in SEQ ID NO: 156. The reverse complement of XP14
was found to encode the amino acid sequence provided in SEQ ID NO: 157.

Comparison of the sequences for the remaining 14 clones (hereinafter
referred to as XP1-XP6, XP17-XP19, XP22, XP25, XP27, XP30 and XP36) with those
in the gene bank as described above revealed no homologies, with the exception of the
3' ends of XP2 and XP6, which were found to bear some homology to known M.
tuberculosis cosmids. The DNA sequences for XP27 and XP36 are shown in SEQ ID
NOS: 158 and 159, respectively, with the 5' sequences for XP4, XP5, XP17 and XP30
being shown in SEQ ID NOS: 160-163, respectively, and the 5' and 3' sequences for
XP2, XP3, XP6, XP18, XP19, XP22 and XP25 being shown in SEQ ID NOS: 164 and
165; 166 and 167; 168 and 169; 170 and 171; 172 and 173; 174 and 175; and 176 and
177, respectively. XP1 was found to overlap with the DNA sequences for TbH4,
disclosed above. The full-length DNA sequence for TbH4-XP1 is provided in SEQ ID
NO: 178. This DNA sequence was found to contain an open reading frame encoding
the amino acid sequence shown in SEQ ID NO: 179. The reverse complement of
TbH4-XP1 was found to contain an open reading frame encoding the amino acid
sequence shown in SEQ ID NO: 180. The DNA sequence for XP36 was found to
contain two open reading frames encoding the amino acid sequences shown in SEQ ID
NOS: 181 and 182, with the reverse complement containing an open reading frame
encoding the amino acid sequence shown in SEQ ID NO: 183.

Recombinant XP1 protein was prepared as described above in Example
3B, with a metal ion affinity chromatography column being employed for purification.
Recombinant XP1 was found to stimulate cell proliferation and IFN-γ production in T
cells isolated from M. tuberculosis-immune donors.



Use of ^ I YSATF Pn^mvp ^pp , ,,^ p^p, ppp,,^ p ^ -^p ,^. 



HAVTNr; 



Genomic DNA was isolated from M. tuberculosis Erdman strain,
randomly sheared and used to construct an expression library employing the Lambda
Screen expression system (Novagen, Madison, WI), as described below in Example 6.
Pooled serum obtained from M. tuberculosis-infected patients that was shown to
react with M. tuberculosis lysate but not with the previously expressed proteins 38 kD,
Tb38-1, TbRa3, TbH4, DPEP and TbRa11, was used to screen the expression library as
described above in Example 3B, with the secondary antibody being goat anti-human
IgG - A - M (H+L) conjugated with alkaline phosphatase.

Twenty-seven clones were purified. Comparison of the determined
cDNA sequences for these clones revealed no significant homologies to 10 of the clones
(hereinafter referred to as LSER-10, LSER-11, LSER-12, LSER-13, LSER-16,
LSER-18, LSER-23, LSER-24, LSER-25 and LSER-27). The determined 5' cDNA sequences
for LSER-10, LSER-11, LSER-12, LSER-13, LSER-16 and LSER-25 are provided in
SEQ ID NO: 237-242, respectively, with the corresponding predicted amino acid
sequences for LSER-10, LSER-12, LSER-13, LSER-16 and LSER-25 being provided in
SEQ ID NO: 243-247, respectively. The determined full-length cDNA sequences for
LSER-18, LSER-23, LSER-24 and LSER-27 are shown in SEQ ID NO: 248-251,
respectively, with the corresponding predicted amino acid sequences being provided in
SEQ ID NO: 252-255. The remaining seventeen clones were found to show
similarities to unknown sequences previously identified in M. tuberculosis. The
determined 5' cDNA sequences for sixteen of these clones (hereinafter referred to as
LSER-1, LSER-3, LSER-4, LSER-5, LSER-6, LSER-8, LSER-14, LSER-15, LSER-17,
LSER-19, LSER-20, LSER-22, LSER-26, LSER-28, LSER-29 and LSER-30) are
provided in SEQ ID NO: 256-271, respectively, with the corresponding predicted amino
acid sequences for LSER-1, LSER-3, LSER-5, LSER-6, LSER-8, LSER-14, LSER-15,
LSER-17, LSER-19, LSER-20, LSER-22, LSER-26, LSER-28, LSER-29 and LSER-30
being provided in SEQ ID NO: 272-286, respectively. The determined full-length
cDNA sequence for the clone LSER-9 is provided in SEQ ID NO: 287. The reverse
complement of LSER-6 (SEQ ID NO: 288) was found to encode the predicted amino
acid sequence of SEQ ID NO: 289.

PREPARATION OF M. TUBERCULOSIS SOLUBLE ANTIGENS USING RABBIT
SERA RAISED AGAINST M. TUBERCULOSIS FRACTIONATED PROTEINS

M. tuberculosis lysate was prepared as described above in Example 2.
The resulting material was fractionated by HPLC and the fractions screened by Western
blot for serological activity with a serum pool from M. tuberculosis-infected patients
which showed little or no immunoreactivity with other antigens of the present
invention. Rabbit anti-sera was generated against the most reactive fraction using the
method described in Example 3A. The anti-sera was used to screen an M. tuberculosis
Erdman strain genomic DNA expression library prepared as described above.
Bacteriophage plaques expressing immunoreactive antigens were purified. Phagemid
from the plaques was rescued and the nucleotide sequences of the M. tuberculosis
clones determined.

Ten different clones were purified. Of these, one was found to be
TbRa35, described above, and one was found to be the previously identified M.
tuberculosis antigen, HSP60. Of the remaining eight clones, six (hereinafter referred to
as RDIF2, RDIF5, RDIF8, RDIF10, RDIF11 and RDIF12) were found to bear some
similarity to previously identified M. tuberculosis sequences. The determined DNA
sequences for RDIF2, RDIF5, RDIF8, RDIF10 and RDIF11 are provided in SEQ ID
NOS: 184-188, respectively, with the corresponding predicted amino acid sequences
being provided in SEQ ID NOS: 189-193, respectively. The 5' and 3' DNA sequences
for RDIF12 are provided in SEQ ID NOS: 194 and 195, respectively. No significant
homologies were found to the antigen RDIF7. The determined DNA and predicted
amino acid sequences for RDIF7 are provided in SEQ ID NOS: 196 and 197,
respectively. One additional clone, referred to as RDIF6, was isolated; however, this
was found to be identical to RDIF5.

Recombinant RDIF6, RDIF8, RDIF10 and RDIF11 were prepared as
described above. These antigens were found to stimulate cell proliferation and IFN-γ
production in T cells isolated from M. tuberculosis-immune donors.



EXAMPLE 4

PURIFICATION AND CHARACTERIZATION OF A POLYPEPTIDE FROM TUBERCULIN PURIFIED
PROTEIN DERIVATIVE

An M. tuberculosis polypeptide was isolated from tuberculin purified
protein derivative (PPD) as follows.

PPD was prepared as published with some modification (Seibert, F. et
al., Tuberculin purified protein derivative. Preparation and analyses of a large quantity
for standard. The American Review of Tuberculosis 44:9-25, 1941). M. tuberculosis
Rv strain was grown for 6 weeks in synthetic medium in roller bottles at 37°C. Bottles
containing the bacterial growth were then heated to 100°C in water vapor for 3 hours.
Cultures were sterile filtered using a 0.22 µ filter and the liquid phase was concentrated
20 times using a 3 kD cut-off membrane. Proteins were precipitated once with 50%
ammonium sulfate solution and eight times with 25% ammonium sulfate solution. The
resulting proteins (PPD) were fractionated by reverse phase liquid chromatography
(RP-HPLC) using a C18 column (7.8 x 300 mm; Waters, Milford, MA) in a Biocad
HPLC system (Perseptive Biosystems, Framingham, MA). Fractions were eluted from
the column with a linear gradient from 0-100% buffer (0.1% TFA in acetonitrile). The
flow rate was 10 ml/minute and eluent was monitored at 214 nm and 280 nm.

Six fractions were collected, dried, suspended in PBS and tested
individually in M. tuberculosis-infected guinea pigs for induction of delayed type
hypersensitivity (DTH) reaction. One fraction was found to induce a strong DTH
reaction and was subsequently fractionated further by RP-HPLC on a microbore Vydac
C18 column (Cat. No. 218TP5115) in a Perkin Elmer/Applied Biosystems Division
Model 172 HPLC. Fractions were eluted with a linear gradient from 5-100% buffer
(0.05% TFA in acetonitrile) with a flow rate of 80 µl/minute. Eluent was monitored at
215 nm. Eight fractions were collected and tested for induction of DTH in M.
tuberculosis-infected guinea pigs. One fraction was found to induce strong DTH of
about 16 mm induration. The other fractions did not induce detectable DTH. The
positive fraction was submitted to SDS-PAGE gel electrophoresis and found to contain
a single protein band of approximately 12 kD molecular weight.

This polypeptide, hereinafter referred to as DPPD, was sequenced from
the amino terminal using a Perkin Elmer/Applied Biosystems Division Procise 492
protein sequencer as described above and found to have the N-terminal sequence shown
in SEQ ID NO: 124. Comparison of this sequence with known sequences in the gene
bank as described above revealed no known homologies. Four cyanogen bromide
fragments of DPPD were isolated and found to have the sequences shown in SEQ ID
NOS: 125-128. A subsequent search of the M. tuberculosis genome database released
by the Institute for Genomic Research revealed a match of the DPPD partial amino acid
sequence with a sequence present within the M. tuberculosis cosmid MTY21C12. An
open reading frame of 336 bp was identified. The full-length DNA sequence for DPPD
is provided in SEQ ID NO: 235, with the corresponding full-length amino acid
sequence being provided in SEQ ID NO: 236.
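
As a rough consistency check, the 336 bp open reading frame can be related to the approximately 12 kD band seen on SDS-PAGE. The sketch below is illustrative and rests on two assumptions not stated in the text: that the 336 bp figure includes the stop codon, and that an average residue mass of about 110 Da (a common rule of thumb) applies. The function name is hypothetical.

```python
# Rule-of-thumb average mass of one amino acid residue, in daltons.
# An assumption of this sketch; real proteins deviate from it.
AVG_RESIDUE_MASS_DA = 110.0


def orf_to_protein(orf_bp: int, includes_stop: bool = True):
    """Return (residue_count, approximate_mass_kd) for an ORF of orf_bp bases."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_bp // 3
    # Drop one codon when the stated length is assumed to include the stop codon.
    residues = codons - 1 if includes_stop else codons
    mass_kd = residues * AVG_RESIDUE_MASS_DA / 1000.0
    return residues, mass_kd


if __name__ == "__main__":
    residues, mass_kd = orf_to_protein(336)
    print(f"{residues} residues, about {mass_kd:.1f} kD")
```

Under these assumptions the open reading frame encodes 111 residues with an estimated mass near 12 kD, in line with the observed band.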



EXAMPLE 5

USE OF SERA FROM TUBERCULOSIS-INFECTED MONKEYS TO IDENTIFY
DNA SEQUENCES ENCODING M. TUBERCULOSIS ANTIGENS

Genomic DNA was isolated from M. tuberculosis Erdman strain,
randomly sheared and used to construct an expression library employing the Lambda
ZAP expression system (Stratagene, La Jolla, CA). Serum samples were obtained from
a cynomolgous monkey 18, 33, 51 and 56 days following infection with M. tuberculosis
Erdman strain. These samples were pooled and used to screen the M. tuberculosis
genomic DNA expression library using the procedure described above in Example 3C.

Twenty clones were purified. The determined 5' DNA sequences for the clones
referred to as MO-1, MO-2, MO-I, MO-8, MO-9, MO-26, MO-28, MO-29, MO-30,
MO-34 and MO-35 are provided in SEQ ID NO: 210-220, respectively, with the
corresponding predicted amino acid sequences being provided in SEQ ID NO: 221-231.
The full-length DNA sequence of the clone MO-10 is provided in SEQ ID NO: 232,
with the corresponding predicted amino acid sequence being provided in SEQ ID NO:
233. The 3' DNA sequence for the clone MO-27 is provided in SEQ ID NO: 234.

Clones MO-1, MO-30 and MO-35 were found to show a high degree of
relatedness and showed some homology to a previously identified unknown M.
tuberculosis sequence and to cosmid MTCI237. MO-2 was found to show some
homology to aspartokinase from M. tuberculosis. Clones MO-3, MO-7 and MO-27
were found to be identical and to show a high degree of relatedness to MO-5. All four
of these clones showed some homology to M. tuberculosis heat shock protein 70.
MO-27 was found to show some homology to M. tuberculosis cosmid MTCY339. MO-4
and MO-34 were found to show some homology to cosmid SCY21B4 and M.
smegmatis integration host factor, and were both found to show some homology to a
previously identified, unknown M. tuberculosis sequence. MO-6 was found to show
some homology to M. tuberculosis heat shock protein 65. MO-8, MO-9, MO-10,
MO-26 and MO-29 were found to be highly related to each other and to show some
homology to M. tuberculosis dihydrolipoamide succinyltransferase. MO-28, MO-31 and
MO-32 were found to be identical and to show some homology to a previously
identified M. tuberculosis protein. MO-33 was found to show some homology to a
previously identified 14 kDa M. tuberculosis heat shock protein.

Further studies using the above protocol resulted in the isolation of an
additional four clones, hereinafter referred to as MO-12, MO-13, MO-19 and MO-39.
The determined 5' cDNA sequences for these clones are provided in SEQ ID NO:
290-293, respectively, with the corresponding predicted protein sequences being provided in
SEQ ID NO: 294-297, respectively. Comparison of these sequences with those in the
gene bank as described above revealed no significant homologies to MO-39. MO-12,
MO-13 and MO-19 were found to show some homologies to unknown sequences
previously isolated from M. tuberculosis.



EXAMPLE 6

ISOLATION OF DNA SEQUENCES ENCODING M. TUBERCULOSIS ANTIGENS
BY SCREENING OF A NOVEL EXPRESSION LIBRARY

This example illustrates isolation of DNA sequences encoding M.
tuberculosis antigens by screening of a novel expression library with sera from M.
tuberculosis-infected patients that were shown to be unreactive with a panel of the
recombinant M. tuberculosis antigens TbRa11, TbRa3, Tb38-1, TbH4, TbF and 38 kD.

Genomic DNA from M. tuberculosis Erdman strain was randomly
sheared to an average size of 2 kb, and blunt ended with Klenow polymerase, followed
by the addition of EcoRI adaptors. The insert was subsequently ligated into the Screen
phage vector (Novagen, Madison, WI) and packaged in vitro using the PhageMaker
extract (Novagen). The resulting library was screened with sera from several M.
tuberculosis donors that had been shown to be negative on a panel of previously
identified M. tuberculosis antigens as described above in Example 3B.

A total of 22 different clones were isolated. By comparison, screening of
the λZap library described above using the same sera did not result in any positive hits.
One of the clones was found to represent TbRa11, described above. The determined 5'
cDNA sequences for 19 of the remaining 21 clones (hereinafter referred to as Erdsn1,
Erdsn2, Erdsn4-Erdsn10, Erdsn12-18, Erdsn21-Erdsn23 and Erdsn25) are provided in
SEQ ID NO: 298-317, respectively, with the determined 3' cDNA sequences for
Erdsn1, Erdsn2, Erdsn4, Erdsn5, Erdsn7-Erdsn10, Erdsn12-Erdsn18, Erdsn21-Erdsn23
and Erdsn25 being provided in SEQ ID NO: 318-336, respectively. The complete
cDNA insert sequence for the clone Erdsn24 is provided in SEQ ID NO: 337.
Comparison of the determined cDNA sequences with those in the gene bank revealed
no significant homologies to the sequences provided in SEQ ID NO: 304, 311, 313-315,
317, 319, 324, 326, 329, 331, 333, 335 and 337. The sequences of SEQ ID NO:
298-303, 305-310, 312, 316, 318, 320-321, 324-326, 328, 330, 332, 334 and 336 were found
to show some homology to unknown sequences previously identified in M.
tuberculosis.



EXAMPLE 7

ISOLATION OF SOLUBLE M. TUBERCULOSIS ANTIGENS
USING MASS SPECTROMETRY



This example illustrates the use of mass spectrometry to identify soluble
M. tuberculosis antigens.

In a first approach, M. tuberculosis culture filtrate was screened by
Western analysis using serum from a tuberculosis-infected individual. The reactive
bands were excised from a silver stained gel and the amino acid sequences determined
by mass spectrometry. The determined amino acid sequence for one of the isolated
antigens is provided in SEQ ID NO: 338. Comparison of this sequence with those in
the gene bank revealed homology to the 85b precursor antigen previously identified in
M. tuberculosis.

In a second approach, the high molecular weight region of M.
tuberculosis culture supernatant was studied. This area may contain immunodominant
antigens which may be useful in the diagnosis of M. tuberculosis infection. Two known
monoclonal antibodies, IT42 and IT57 (available from the Center for Disease Control,
Atlanta, GA), show reactivity by Western analysis to antigens in this vicinity, although
the identity of the antigens remains unknown. In addition, unknown high-molecular
weight proteins have been described as containing a surrogate marker for M.
tuberculosis infection in HIV-positive individuals (J. Infect. Dis. 175:133-143, 1997).
To determine the identity of these antigens, two-dimensional gel electrophoresis and
two-dimensional Western analysis were performed using the antibodies IT57 and IT42.
Five protein spots in the high molecular weight region were identified, individually
excised, enzymatically digested and subjected to mass spectrometric analysis.

The determined amino acid sequences for three of these spots (referred to
as spots 1, 2 and 4) are provided in SEQ ID NO: 339, 340-341 and 342, respectively.
Comparison of these sequences with those in the gene bank revealed that spot 1 is the
previously identified PcK-1, a phosphoenolpyruvate kinase. The two sequences
isolated from spot 2 were determined to be from two DNAks, previously identified in
M. tuberculosis as heat shock proteins. Spot 4 was determined to be the previously
identified M. tuberculosis protein Kat G. To the best of the inventors' knowledge,
neither PcK-1 nor the two DNAks have previously been shown to have utility in the
diagnosis of M. tuberculosis infection.

EXAMPLE 8

SYNTHESIS OF SYNTHETIC POLYPEPTIDES

Polypeptides may be synthesized on a Millipore 9050 peptide
synthesizer using FMOC chemistry with HBTU (O-Benzotriazole-N,N,N',N'-
tetramethyluronium hexafluorophosphate) activation. A Gly-Cys-Gly sequence may be
attached to the amino terminus of the peptide to provide a method of conjugation or
labeling of the peptide. Cleavage of the peptides from the solid support may be carried
out using the following cleavage mixture: trifluoroacetic
acid:ethanedithiol:thioanisole:water:phenol (40:1:2:2:3). After cleaving for 2 hours, the
peptides may be precipitated in cold methyl-t-butyl-ether. The peptide pellets may then
be dissolved in water containing 0.1% trifluoroacetic acid (TFA) and lyophilized prior
to purification by C18 reverse phase HPLC. A gradient of 0-60% acetonitrile
(containing 0.1% TFA) in water (containing 0.1% TFA) may be used to elute the
peptides. Following lyophilization of the pure fractions, the peptides may be
characterized using electrospray mass spectrometry and by amino acid analysis.

This procedure was used to synthesize a TbM-1 peptide that contains one
and a half repeats of a TbM-1 sequence. The TbM-1 peptide has the sequence
GCGDRSGGNLDQIRLRRDRSGGNL (SEQ ID NO: 63).
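
The "one and a half repeats" structure of this peptide can be seen by removing the amino-terminal Gly-Cys-Gly sequence and looking for the shortest unit that, followed by a prefix of itself, reproduces the remainder. The following sketch is illustrative only: the repeat unit it reports is inferred from the sequence by the code and is not separately stated in the text, and the function name is hypothetical.

```python
PEPTIDE = "GCGDRSGGNLDQIRLRRDRSGGNL"  # SEQ ID NO: 63 as given in the text
LINKER = "GCG"  # Gly-Cys-Gly attached at the amino terminus


def split_repeats(peptide: str, linker: str = LINKER):
    """Return (unit, partial) where peptide == linker + unit + partial and
    partial is a prefix of unit no longer than the unit itself."""
    if not peptide.startswith(linker):
        raise ValueError("peptide does not start with the expected linker")
    body = peptide[len(linker):]
    for unit_len in range(1, len(body)):
        unit, partial = body[:unit_len], body[unit_len:]
        if len(partial) <= unit_len and unit.startswith(partial):
            return unit, partial
    raise ValueError("no unit-plus-partial-repeat structure found")


if __name__ == "__main__":
    unit, partial = split_repeats(PEPTIDE)
    print(unit, partial, len(partial) / len(unit))
```

For the sequence above this yields a 14-residue unit followed by its first 7 residues, that is, one and a half repeats.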

EXAMPLE 9

USE OF REPRESENTATIVE ANTIGENS FOR SERODIAGNOSIS OF TUBERCULOSIS

This Example illustrates the diagnostic properties of several
representative antigens.

Assays were performed in 96-well plates that were coated with 200 ng
antigen diluted to 50 µL in carbonate coating buffer, pH 9.6. The wells were coated
overnight at 4°C (or 2 hours at 37°C). The plate contents were then removed and the
wells were blocked for 2 hours with 200 µL of PBS/1% BSA. After the blocking step,
the wells were washed five times with PBS/0.1% Tween 20™. 50 µL sera, diluted
1:100 in PBS/0.1% Tween 20™/0.1% BSA, was then added to each well and incubated
for 30 minutes at room temperature. The plates were then washed again five times with
PBS/0.1% Tween 20™.

The enzyme conjugate (horseradish peroxidase - Protein A, Zymed, San
Francisco, CA) was then diluted 1:10,000 in PBS/0.1% Tween 20™/0.1% BSA, and 50
µL of the diluted conjugate was added to each well and incubated for 30 minutes at
room temperature. Following incubation, the wells were washed five times with
PBS/0.1% Tween 20™. 100 µL of tetramethylbenzidine peroxidase (TMB) substrate
(Kirkegaard and Perry Laboratories, Gaithersburg, MD) was added, undiluted, and
incubated for about 15 minutes. The reaction was stopped with the addition of 100 µL
of 1 N H2SO4 to each well, and the plates were read at 450 nm.
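
The per-well quantities implied by this protocol follow from simple arithmetic. The sketch below is illustrative; it assumes that a dilution written as 1:N means one part stock in N total parts, which is a common convention but is not spelled out in the text. The function names are hypothetical.

```python
def coating_concentration_ug_per_ml(antigen_ng: float, volume_ul: float) -> float:
    """Antigen concentration in the coating buffer; ng/µL equals µg/mL."""
    return antigen_ng / volume_ul


def stock_volume_ul(final_volume_ul: float, dilution_factor: float) -> float:
    """Volume of undiluted stock contained in final_volume_ul of a 1:N dilution,
    assuming 1:N means one part stock in N total parts."""
    return final_volume_ul / dilution_factor


if __name__ == "__main__":
    # 200 ng antigen in 50 µL of coating buffer.
    print(coating_concentration_ug_per_ml(200, 50), "µg/mL antigen")
    # 50 µL of serum diluted 1:100.
    print(stock_volume_ul(50, 100), "µL serum per well")
    # 50 µL of conjugate diluted 1:10,000.
    print(stock_volume_ul(50, 10_000), "µL conjugate per well")
```

Under this convention each well is coated at 4 µg/mL antigen and receives the equivalent of 0.5 µL of undiluted serum.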

Figure 4 shows the ELISA reactivity of two recombinant antigens
isolated using method A in Example 3 (TbRa3 and TbRa9) with sera from
M. tuberculosis positive and negative patients. The reactivity of these antigens is
compared to that of bacterial lysate isolated from M. tuberculosis strain H37Ra (Difco,
Detroit, MI). In both cases, the recombinant antigens differentiated positive from
negative sera. Based on cut-off values obtained from receiver-operator curves, TbRa3
detected 56 out of 87 positive sera, and TbRa9 detected 111 out of 165 positive sera.

Figure 5 illustrates the ELISA reactivity of representative antigens
isolated using method B of Example 3. The reactivity of the recombinant antigens
TbH4, TbH12, Tb38-1 and the peptide TbM-1 (as described in Example 8) is compared
to that of the 38 kD antigen described by Andersen and Hansen, Infect. Immun.
57:2481-2488, 1989. Again, all of the polypeptides tested differentiated positive from
negative sera. Based on cut-off values obtained from receiver-operator curves, TbH4
detected 67 out of 126 positive sera, TbH12 detected 50 out of 125 positive sera, Tb38-1
detected 61 out of 101 positive sera and the TbM-1 peptide detected 25 out of 30
positive sera.
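
The text does not state how a cut-off value was chosen from each receiver-operator curve. One common approach, shown here purely as an illustrative sketch, is to pick the cut-off that maximizes Youden's index (sensitivity + specificity - 1). The OD values below are hypothetical and are not data from this specification; the function names are likewise hypothetical.

```python
# Hypothetical OD450 readings, for illustration only (not from the text).
EXAMPLE_POSITIVE = [0.95, 0.62, 0.48, 0.15, 1.30]
EXAMPLE_NEGATIVE = [0.05, 0.10, 0.20, 0.08]


def roc_points(positive, negative):
    """Return (cutoff, sensitivity, specificity) for every observed value.
    A serum is called positive when its OD is >= the cut-off."""
    points = []
    for cutoff in sorted(set(positive) | set(negative)):
        sensitivity = sum(v >= cutoff for v in positive) / len(positive)
        specificity = sum(v < cutoff for v in negative) / len(negative)
        points.append((cutoff, sensitivity, specificity))
    return points


def best_cutoff(positive, negative):
    """Pick the point maximizing Youden's index (sensitivity + specificity - 1)."""
    return max(roc_points(positive, negative), key=lambda p: p[1] + p[2] - 1)


if __name__ == "__main__":
    cutoff, sens, spec = best_cutoff(EXAMPLE_POSITIVE, EXAMPLE_NEGATIVE)
    print(f"cut-off {cutoff}: sensitivity {sens:.2f}, specificity {spec:.2f}")
```

With a cut-off fixed in this way, the number of positive sera whose OD meets or exceeds it gives detection counts of the kind reported above.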

The reactivity of four antigens (TbRa3, TbRa9, TbH4 and TbH12) with
sera from a group of M. tuberculosis infected patients with differing reactivity in the
acid fast stain of sputum (Smithwick and David, Tubercle 52:226, 1971) was also
examined, and compared to the reactivity of M. tuberculosis lysate and the 38 kD
antigen. The results are presented in Table 3, below:

TABLE 3

REACTIVITY OF ANTIGENS WITH SERA FROM M. TUBERCULOSIS PATIENTS

                Acid Fast                        ELISA Values
Patient         Sputum        Lysate   38kD    TbRa9   TbH12   TbH4    TbRa3
(illegible)     (illegible)   1.853    0.634   0.998   1.022   1.030   1.314
(illegible)     (illegible)   2.657    2.322   0.608   0.837   1.857   2.335
(illegible)     (illegible)   2.703    0.527   0.492   0.281   0.501   2.002
(illegible)     (illegible)   1.665    1.301   0.685   0.216   0.448   0.458
Tb01B93I-11     (illegible)   2.817    0.697   0.509   0.301   0.173   2.608
Tb01B93I-15     (illegible)   1.28     0.283   0.808   0.218   1.537   0.811
Tb01B93I-16     (illegible)   2.908    >3      0.899   0.441   0.593   1.080
Tb01B93I-25     (illegible)   0.395    0.131   0.335   0.211   0.107   0.948
Tb01B93I-87     (illegible)   2.653    2.432   2.282   0.977   1.221   0.857
Tb01B93I-89     (illegible)   1.912    2.370   2.436   0.876   0.520   0.952
Tb01B94I-108    (illegible)   1.639    0.341   0.797   0.368   0.654   0.798
Tb01B94I-201    (illegible)   1.721    0.419   0.661   0.137   0.064   0.692
Tb01B93I-88     (illegible)   1.939    1.269   2.519   1.381   0.214   0.530
Tb01B93I-92     (illegible)   2.355    2.329   2.78    0.685   0.997   2.527
Tb01B94I-109    (illegible)   0.993    0.620   0.574   0.441   0.5     2.558
Tb01B94I-210    (illegible)   2.777    >3      0.393   0.367   1.004   1.315
Tb01B94I-224    (illegible)   2.913    0.476   0.251   1.297   1.990   0.256
Tb01B93I-9      (illegible)   2.649    0.278   0.210   0.140   0.181   1.586
Tb01B93I-14     (illegible)   >3       1.538   0.282   0.291   0.549   2.880
Tb01B93I-21     (illegible)   2.645    0.739   2.499   0.783   0.536   1.770
Tb01B93I-22     (illegible)   0.714    0.451   2.082   0.285   0.269   1.159
Tb01B93I-31     +             0.956    0.490   1.019   0.812   0.176   1.293
Tb01B93I-32     -             2.261    0.786   0.668   0.273   0.535   0.405
Tb01B93I-52     -             0.658    0.114   0.434   0.330   0.273   1.140
Tb01B93I-99     -             2.118    0.584   1.62    0.119   0.977   0.729
Tb01B94I-130    -             1.349    0.224   0.86    0.282   0.383   2.146
Tb01B94I-131    -             (illegible) 0.324 1.173  0.059   0.118   1.431
AT4-0070        Normal        0.072    0.043   0.092   0.071   0.040   0.039
AT4-0105        Normal        0.397    0.121   0.118   0.103   0.078   0.390
3/15/94-1       Normal        0.227    0.064   0.098   0.026   0.001   0.228
4/15/93-2       Normal        0.114    0.240   0.071   0.034   0.041   0.264
5/26/94-4       Normal        0.089    0.259   0.096   0.046   0.008   0.053
5/26/94-3       Normal        0.139    0.093   0.085   0.019   0.067   0.01


Based on cut-off values obtained from receiver-operator curves, TbRa3
detected 23 out of 27 positive sera, TbRa9 detected 22 out of 27, TbH4 detected 18 out
of 27 and TbH12 detected 15 out of 27. If used in combination, these four antigens
would have a theoretical sensitivity of 27 out of 27, indicating that these antigens
should complement each other in the serological detection of M. tuberculosis infection.
In addition, several of the recombinant antigens detected positive sera that were not
detected using the 38 kD antigen, indicating that these antigens may be complementary
to the 38 kD antigen.
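
The way a combined sensitivity arises from individual antigens can be sketched as follows. This is a minimal illustration only: the four sera and their OD values are transcribed from the table above, but the cut-off of 0.5 applied to every antigen is a hypothetical placeholder, since the cut-offs actually derived from the receiver-operator curves are not listed here.

```python
# Hypothetical cut-offs (assumption): the real values came from
# receiver-operator curves and are not given in the text.
CUTOFFS = {"TbRa9": 0.5, "TbH12": 0.5, "TbH4": 0.5, "TbRa3": 0.5}

# OD values for four patient sera, transcribed from the table above.
SERA = {
    "Tb01B93I-89":  {"TbRa9": 2.436, "TbH12": 0.876, "TbH4": 0.520, "TbRa3": 0.952},
    "Tb01B94I-201": {"TbRa9": 0.661, "TbH12": 0.137, "TbH4": 0.064, "TbRa3": 0.692},
    "Tb01B93I-9":   {"TbRa9": 0.210, "TbH12": 0.140, "TbH4": 0.181, "TbRa3": 1.586},
    "Tb01B93I-32":  {"TbRa9": 0.668, "TbH12": 0.273, "TbH4": 0.535, "TbRa3": 0.405},
}


def detected_by(antigen):
    """Return the set of sera whose OD exceeds the cut-off for one antigen."""
    return {serum for serum, od in SERA.items() if od[antigen] > CUTOFFS[antigen]}


def detected_by_any(antigens):
    """Return the sera detected by at least one of the given antigens."""
    detected = set()
    for antigen in antigens:
        detected |= detected_by(antigen)
    return detected


per_antigen = {antigen: len(detected_by(antigen)) for antigen in CUTOFFS}
combined = len(detected_by_any(CUTOFFS))
print(per_antigen, "combined:", combined, "of", len(SERA))
```

With these placeholder cut-offs no single antigen detects all four sera, while the union of the four antigens does, which is the sense in which the antigens complement each other.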

The reactivity of the recombinant antigen TbRa11 with sera from
M. tuberculosis patients shown to be negative for the 38 kD antigen, as well as with sera
from PPD positive and normal donors, was determined by ELISA as described above.
The results are shown in Figure 6, which indicates that TbRa11, while being negative
with sera from PPD positive and normal donors, detected sera that were negative with
the 38 kD antigen. Of the thirteen 38 kD negative sera tested, nine were positive with
TbRa11, indicating that this antigen may be reacting with a sub-group of 38 kD antigen
negative sera. In contrast, in a group of 38 kD positive sera where TbRa11 was
reactive, the mean OD 450 for TbRa11 was lower than that for the 38 kD antigen. The
data indicate an inverse relationship between the presence of TbRa11 activity and 38 kD
positivity.

The antigen TbRa2A was tested in an indirect ELISA using initially 50 µl
of serum at 1:100 dilution for 30 minutes at room temperature followed by washing in
PBS Tween and incubating for 30 minutes with biotinylated Protein A (Zymed, San
Francisco, CA) at a 1:10,000 dilution. Following washing, 50 µl of streptavidin-
horseradish peroxidase (Zymed) at 1:10,000 dilution was added and the mixture
incubated for 30 minutes. After washing, the assay was developed with TMB substrate
as described above. The reactivity of TbRa2A with sera from M. tuberculosis patients
and normal donors is shown in Table 4. The mean value for reactivity of TbRa2A with
sera from M. tuberculosis patients was 0.444 with a standard deviation of 0.309. The
mean for reactivity with sera from normal donors was 0.109 with a standard deviation
of 0.029. Testing of 38 kD negative sera (Figure 7) also indicated that the TbRa2A
antigen was capable of detecting sera in this category.

TABLE 4
REACTIVITY OF TBRA2A WITH SERA FROM M. TUBERCULOSIS PATIENTS
AND FROM NORMAL DONORS

Serum ID      Status    OD 450
Tb85          TB        0.680
Tb86          TB        0.450
Tb87          TB        0.263
Tb88          TB        0.275
Tb89          TB        0.403
Tb91          TB        0.393
Tb92          TB        0.401
Tb93          TB        0.232
Tb94          TB        0.333
[illegible]   TB        0.435
[illegible]   TB        0.284
[illegible]   TB        0.320
Tb99          TB        0.328
Tb100         TB        0.817
Tb101         TB        0.607
Tb102         TB        0.191
Tb103         TB        0.228
Tb107         TB        0.324
Tb109         TB        1.572
Tb111         TB        [illegible]
DL4-0176      Normal    0.036
AT4-0043      Normal    0.126
AT4-0044      Normal    0.130
AT4-0052      Normal    0.135
AT4-0053      Normal    0.133
AT4-0062      Normal    0.128
AT4-0070      Normal    0.088
AT4-0091      Normal    0.108
AT4-0100      Normal    0.106
AT4-0105      Normal    0.108
AT4-0109      Normal    0.105
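
The summary statistics quoted for the normal donors can be checked directly against the eleven normal OD 450 values in Table 4. The sketch below assumes the standard deviation reported in the text is the sample standard deviation (n - 1 denominator); under that assumption it gives a mean of 0.109 and a standard deviation of 0.029. The patient statistics are not recomputed because one patient value in the table is illegible.

```python
from statistics import mean, stdev

# OD 450 values for the eleven normal donor sera in Table 4.
NORMAL_OD = [0.036, 0.126, 0.130, 0.135, 0.133, 0.128,
             0.088, 0.108, 0.106, 0.108, 0.105]

normal_mean = mean(NORMAL_OD)
normal_sd = stdev(NORMAL_OD)  # sample standard deviation (assumption)

print(f"mean = {normal_mean:.3f}, SD = {normal_sd:.3f}")
```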



The reactivity of the recombinant antigen (g) (SEQ ID NO: 60) with sera
from M. tuberculosis patients and normal donors was determined by ELISA as
described above. Figure 8 shows the results of the titration of antigen (g) with four
M. tuberculosis positive sera that were all reactive with the 38 kD antigen and with four
donor sera. All four positive sera were reactive with antigen (g).

The reactivity of the recombinant antigen TbH-29 (SEQ ID NO: 137)
with sera from M. tuberculosis patients, PPD positive donors and normal donors was
determined by indirect ELISA as described above. The results are shown in Figure 9.
TbH-29 detected 30 out of 60 M. tuberculosis sera, 2 out of 8 PPD positive sera and 2
out of 27 normal sera.
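
These counts can be restated as detection rates. The short sketch below does only that arithmetic; the terms "sensitivity" and "specificity" and the helper name `rate` are labels introduced here for illustration and are not used in the text for TbH-29.

```python
def rate(detected, total):
    """Fraction of sera in a group that were scored positive."""
    return detected / total


# Counts reported for TbH-29 in the paragraph above.
sensitivity_tb = rate(30, 60)            # M. tuberculosis sera detected
positive_rate_ppd = rate(2, 8)           # PPD positive donors scored positive
specificity_normal = 1.0 - rate(2, 27)   # normal donors scored negative

print(f"TB sera detected:        {sensitivity_tb:.1%}")
print(f"PPD positive reactive:   {positive_rate_ppd:.1%}")
print(f"Normal donors negative:  {specificity_normal:.1%}")
```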

Figure 10 shows the results of ELISA tests (both direct and indirect) of
the antigen TbH-33 (SEQ ID NO: 140) with sera from M. tuberculosis patients and
from normal donors and with a pool of sera from M. tuberculosis patients. The mean
OD 450 was demonstrated to be higher with sera from M. tuberculosis patients than
from normal donors, with the mean OD 450 being significantly higher in the indirect
ELISA than in the direct ELISA. Figure 11 is a titration curve for the reactivity of
recombinant TbH-33 with sera from M. tuberculosis patients and from normal donors
showing an increase in OD 450 with increasing concentration of antigen.

The reactivity of the recombinant antigens RDIF6, RDIF8 and RDIF10
(SEQ ID NOS: 184-187, respectively) with sera from M. tuberculosis patients and
normal donors was determined by ELISA as described above. RDIF6 detected 6 out of
32 M. tuberculosis sera and 0 out of 15 normal sera; RDIF8 detected 14 out of 32 M.
tuberculosis sera and 0 out of 15 normal sera; and RDIF10 detected 4 out of 27 M.
tuberculosis sera and 1 out of 15 normal sera. In addition, RDIF10 was found to detect
0 out of 5 sera from PPD-positive donors.

The antigens MO-1, MO-2, MO-4, MO-28 and MO-29 described above
in Example 5, were expressed in E. coli and purified using a hexahistidine tag. The
reactivity of these antigens with both M. tuberculosis positive and negative sera was
examined by ELISA as described above. Titration curves showing the reactivity of
MO-1, MO-2, MO-4, MO-28 and MO-29 at different solid phase coat levels when
tested against four M. tuberculosis positive sera and four M. tuberculosis negative sera
are shown in Figs. 12A-E, respectively. Three of the clones, MO-1, MO-2 and MO-29,
were further tested on panels of HIV positive/tuberculosis (HIV/TB) positive and
extrapulmonary sera. MO-1 detected 3/20 extrapulmonary and 2/38 HIV/TB sera. On
the same sera groups, MO-2 detected 2/20 and 10/38, and MO-29 detected 2/20 and
8/38 sera. In combination these three clones would have detected 4/20 extrapulmonary
sera and 16/38 HIV/TB sera. In addition, MO-1 detected 6/17 sera that had previously
been shown only to react with M. tuberculosis lysate and not with either 38 kD or with
other antigens of the subject invention.

EXAMPLE 10

PREPARATION AND CHARACTERIZATION OF M. TUBERCULOSIS FUSION PROTEINS

A fusion protein containing TbRa3, the 38 kD antigen and Tb38-1 was
prepared as follows.

Each of the DNA constructs TbRa3, 38 kD and Tb38-1 were modified
by PCR in order to facilitate their fusion and the subsequent expression of the fusion
protein TbRa3-38 kD-Tb38-1. TbRa3, 38 kD and Tb38-1 DNA was used to perform
PCR using the primers PDM-64 and PDM-65 (SEQ ID NO: 141 and 142), PDM-57 and
PDM-58 (SEQ ID NO: 143 and 144), and PDM-69 and PDM-60 (SEQ ID NO: 145-
146), respectively. In each case, the DNA amplification was performed using 10 µl
10X Pfu buffer, 2 µl 10 mM dNTPs, 2 µl each of the PCR primers at 10 µM
concentration, 81.5 µl water, 1.5 µl Pfu DNA polymerase (Stratagene, La Jolla, CA)
and 1 µl DNA at either 70 ng/µl (for TbRa3) or 50 ng/µl (for 38 kD and Tb38-1). For
TbRa3, denaturation at 94°C was performed for 2 min, followed by 40 cycles of 96°C
for 15 sec and 72°C for 1 min, and lastly by 72°C for 4 min. For 38 kD, denaturation at
96°C was performed for 2 min, followed by 40 cycles of 96°C for 30 sec, 68°C for 15
sec and 72°C for 3 min, and finally by 72°C for 4 min. For Tb38-1, denaturation at 94°C
for 2 min was followed by 10 cycles of 96°C for 15 sec, 68°C for 15 sec and 72°C for
1.5 min, 30 cycles of 96°C for 15 sec, 64°C for 15 sec and 72°C for 1.5 min, and finally by
72°C for 4 min.
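
As a consistency check on the amplification conditions just described, the sketch below sums the reaction components and the programmed hold times of the three cycling programs. It assumes the listed volumes are in microlitres and that "each of the PCR primers" means two primers per reaction; the labels "forward primer" and "reverse primer" and the function name are illustrative choices, and ramp times between temperatures are not included.

```python
# Reaction components in microlitres, as listed in the paragraph above.
REACTION_UL = {
    "10X Pfu buffer": 10.0,
    "10 mM dNTPs": 2.0,
    "forward primer": 2.0,   # label is illustrative
    "reverse primer": 2.0,   # label is illustrative
    "water": 81.5,
    "Pfu DNA polymerase": 1.5,
    "template DNA": 1.0,
}

# Each program: (initial denaturation s, [(cycles, [step seconds])], final step s)
PROGRAMS = {
    "TbRa3": (120, [(40, [15, 60])], 240),
    "38 kD": (120, [(40, [30, 15, 180])], 240),
    "Tb38-1": (120, [(10, [15, 15, 90]), (30, [15, 15, 90])], 240),
}


def programmed_seconds(initial, stages, final):
    """Total hold time of a cycling program, excluding temperature ramps."""
    return initial + sum(cycles * sum(steps) for cycles, steps in stages) + final


total_volume_ul = sum(REACTION_UL.values())
hold_minutes = {name: programmed_seconds(*prog) / 60 for name, prog in PROGRAMS.items()}

print(f"reaction volume: {total_volume_ul} µl")
for name, minutes in hold_minutes.items():
    print(f"{name}: {minutes:.0f} min of programmed hold time")
```

The components add up to a 100 µl reaction, which is consistent with reading the volume units as microlitres.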

The TbRa3 PCR fragment was digested with NdeI and EcoRI and cloned
directly into pT7^L2 IL I vector using NdeI and EcoRI sites. The 38 kD PCR fragment
was digested with Sse8387I, treated with T4 DNA polymerase to make blunt ends and
then digested with EcoRI for direct cloning into the pT7^L2Ra3-1 vector which was
digested with StuI and EcoRI. The 38-1 PCR fragment was digested with Eco47III and
EcoRI and directly subcloned into pT7^L2Ra3/38kD-17 digested with the same
enzymes. The whole fusion was then transferred to pET28b using NdeI and EcoRI
sites. The fusion construct was confirmed by DNA sequencing.

The expression construct was transformed to BLR pLys S E. coli
(Novagen, Madison, WI) and grown overnight in LB broth with kanamycin (30 µg/ml)
and chloramphenicol (34 µg/ml). This culture (12 ml) was used to inoculate 500 ml
2XYT with the same antibiotics and the culture was induced with IPTG at an OD560 of
0.44 to a final concentration of 1.2 mM. Four hours post-induction, the bacteria were
harvested and sonicated in 20 mM Tris (8.0), 100 mM NaCl, 0.1% DOC, 20 µg/ml
Leupeptin, 20 mM PMSF followed by centrifugation at 26,000 X g. The resulting
pellet was resuspended in 8 M urea, 20 mM Tris (8.0), 100 mM NaCl and bound to Pro-
bond nickel resin (Invitrogen, Carlsbad, CA). The column was washed several times
with the above buffer then eluted with an imidazole gradient (50 mM, 100 mM, 500
mM imidazole was added to 8 M urea, 20 mM Tris (8.0), 100 mM NaCl). The eluates
containing the protein of interest were then dialyzed against 10 mM Tris (8.0).
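
The induction step reduces to a simple dilution calculation (C1 x V1 = C2 x V2). The sketch below assumes a 1 M IPTG stock, which is a hypothetical value since the stock concentration is not stated, and it ignores the small volume contributed by the inoculum and by the IPTG itself.

```python
def stock_volume_ml(final_mM, culture_ml, stock_mM):
    """Volume of stock needed so that culture_ml reaches final_mM."""
    return final_mM * culture_ml / stock_mM


IPTG_STOCK_MM = 1000.0  # hypothetical 1 M stock (assumption, not from the text)

# 1.2 mM final concentration in the 500 ml 2XYT culture described above.
iptg_ml = stock_volume_ml(final_mM=1.2, culture_ml=500.0, stock_mM=IPTG_STOCK_MM)
print(f"add about {iptg_ml:.2f} ml of IPTG stock")
```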

The DNA and amino acid sequences for the resulting fusion protein
(hereinafter referred to as TbRa3-38 kD-Tb38-1) are provided in SEQ ID NO: 147 and
148, respectively.

A fusion protein containing the two antigens TbH-9 and Tb38-1
(hereinafter referred to as TbH9-Tb38-1) without a hinge sequence, was prepared using
a similar procedure to that described above. The DNA sequence for the TbH9-Tb38-1
fusion protein is provided in SEQ ID NO: 151.

A fusion protein containing TbRa3, the antigen 38kD, Tb38-1 and DPEP
was prepared as follows.

Each of the DNA constructs TbRa3, 38 kD and Tb38-1 were modified
by PCR and cloned into vectors essentially as described above, with the primers PDM-
69 (SEQ ID NO: 145) and PDM-83 (SEQ ID NO: 200) being used for amplification of
the Tb38-1A fragment. Tb38-1A differs from Tb38-1 by a DraI site at the 3' end of the
coding region that keeps the final amino acid intact while creating a blunt restriction site
that is in frame. The TbRa3/38kD/Tb38-1A fusion was then transferred to pET28b
using NdeI and EcoRI sites.

DPEP DNA was used to perform PCR using the primers PDM-84 and
PDM-85 (SEQ ID NO: 201 and 202, respectively) and 1 µl DNA at 50 ng/µl.
Denaturation at 94 °C was performed for 2 min, followed by 10 cycles of 96 °C for 15
sec, 68 °C for 15 sec and 72 °C for 1.5 min; 30 cycles of 96 °C for 15 sec, 64 °C for 15
sec and 72 °C for 1.5 min; and finally by 72 °C for 4 min. The DPEP PCR fragment
was digested with EcoRI and Eco72I and cloned directly into the pET28Ra3/38kD/38-
1A construct which was digested with DraI and EcoRI. The fusion construct was
confirmed to be correct by DNA sequencing. Recombinant protein was prepared as
described above. The DNA and amino acid sequences for the resulting fusion protein
(hereinafter referred to as TbF-2) are provided in SEQ ID NO: 203 and 204,
respectively.

A fusion protein containing TbRa3, the antigen 38kD, Tb38-1 and TbH4
was prepared as follows.

Genomic M. tuberculosis DNA was used to PCR full-length TbH4 (FL
TbH4) with the primers PDM-157 and PDM-160 (SEQ ID NO: 343 and 344,
respectively) and 2 µl DNA at 100 ng/µl. Denaturation at 96 °C was performed for 2
min, followed by 40 cycles of 96 °C for 30 sec, 61 °C for 20 sec and 72 °C for 5 min;
and finally by annealing at 72 °C for 10 min. The FL TbH4 PCR fragment was digested
with EcoRI and ScaI (New England Biolabs) and cloned directly into the
pET28Ra3/38kD/38-1A construct described above which was digested with DraI and
EcoRI. The fusion construct was confirmed to be correct by DNA sequencing.
Recombinant protein was prepared as described above. The DNA and amino acid
sequences for the resulting fusion protein (hereinafter referred to as TbF-6) are provided
in SEQ ID NO: 345 and 346, respectively.

A fusion protein containing the antigen 38kD and DPEP separated by a
linker was prepared as follows.

38 kD DNA was used to perform PCR using the primers PDM-176 and
PDM-175 (SEQ ID NO: 347 and 348, respectively), and 1 µl PET28Ra3/38kD/38-
1/Ra2A-12 DNA at 10 ng/µl. Denaturation at 96 °C was performed for 2 min,
followed by 40 cycles of 96 °C for 30 sec, 71 °C for 15 sec and 72 °C for 5 min and 40
sec; and finally by annealing at 72 °C for 4 min. The two sets of primers PDM-171,
PDM-172, and PDM-173, PDM-174 were annealed by heating to 95 °C for 2 min and
then ramping down to 25 °C slowly at 0.1 °C/sec. DPEP DNA was used to perform
PCR as described above. The 38 kD fragment was digested with Eco RI (New England
Biolabs) and cloned into a modified pT7AL2 vector which was cut with Eco 72 I
(Promega) and Eco RI. The modified pT7AL2 construct was designed to have a
MGHHHHHH amino acid coding region in frame just 5' of the Eco 72 I site. The
construct was digested with Kpn 2I (Gibco, BRL) and Pst I (New England Biolabs) and
the annealed sets of phosphorylated primers (PDM-171, PDM-172 and PDM-173,
PDM-174) were cloned in. The DPEP PCR fragment was digested with Eco RI and
Eco 72 I and cloned into this second construct which was digested with Eco 47 III (New
England Biolabs) and Eco RI. Ligations were done with a ligation kit from Panvera
(Madison, WI). The resulting construct was digested with NdeI (New England Biolabs)
and Eco RI, and transferred to a modified pET28 vector. The fusion construct was
confirmed to be correct by DNA sequencing.

Recombinant protein was prepared essentially as described above. The
DNA and amino acid sequences for the resulting fusion protein (hereinafter referred to
as TbF-8) are provided in SEQ ID NO: 349 and 350, respectively.
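
What it means for the MGHHHHHH coding region to sit "in frame" can be illustrated with a small translation sketch. The nucleotide sequence below is hypothetical: it is one of many possible encodings of Met-Gly-His6, chosen only for illustration, and it is not the sequence of the modified vector. The codon table is deliberately restricted to the four codons used.

```python
# Minimal codon table covering only the codons used in this illustration.
CODON_TABLE = {"ATG": "M", "GGC": "G", "CAT": "H", "CAC": "H"}

# Hypothetical coding sequence for Met-Gly-His6 (assumption, not from the vector).
TAG_DNA = "ATGGGCCATCACCATCACCATCAC"


def translate(dna):
    """Translate a DNA string codon by codon; the length must be a multiple of 3."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


tag_peptide = translate(TAG_DNA)
# A tag whose length is a whole number of codons leaves the reading frame
# unchanged for whatever is cloned immediately downstream of it.
keeps_frame = len(TAG_DNA) % 3 == 0
print(tag_peptide, keeps_frame)
```

Because the tag spans exactly eight codons, a fragment joined at a blunt site immediately 3' of it is read in the same frame as the initiating methionine.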



EXAMPLE 11
USE OF M. TUBERCULOSIS FUSION PROTEINS FOR
SERODIAGNOSIS OF TUBERCULOSIS

The effectiveness of the fusion protein TbRa3-38 kD-Tb38-1, prepared
as described above, in the serodiagnosis of tuberculosis infection was examined by
ELISA.

The ELISA protocol was as described above in Example 6, with the
fusion protein being coated at 200 ng/well. A panel of sera was chosen from a group of
tuberculosis patients previously shown, either by ELISA or by western blot analysis, to
react with each of the three antigens individually or in combination. Such a panel
enabled the dissection of the serological reactivity of the fusion protein to determine if
all three epitopes functioned with the fusion protein. As shown in Table 5, all four sera
that reacted with TbRa3 only were detectable with the fusion protein. Three sera that
reacted only with Tb38-1 were also detectable, as were two sera that reacted with 38 kD
alone. The remaining 15 sera were all positive with the fusion protein based on a cut-
off in the assay of mean negatives + 3 standard deviations. These data demonstrate the
functional activity of all three epitopes in the fusion protein.



Table 5
PATTFm-s: 



                        ELISA and/or Western Blot
                        Reactivity with
                        Individual Proteins          Fusion         Fusion
                                                     Recombinant    Recombinant
Serum ID      Status    38kd     Tb38-1    TbRa3     OD 450         Status
01B93I-40     TB                                     0.413
01B93I-41     TB                 +         +         0.392
01B93I-29     TB                                     2.217
01B93I-109    TB        +        +         +         0.522          +
01B93I-132    TB                                     0.937
5004          TB                                     1.098
15004         TB                                     2.077
39004         TB                                     1.675          +
68004         TB        -                  +         2.388          +
99004         TB        -        +         +         0.607          +
107004        TB                                     0.667          +
92004         TB        -                            1.070
97004         TB        +                            1.152          +
118004        TB        +                            2.694          +
173004        TB                 +                   3.258          +
175004        TB        +                  +         2.514          +
274004        TB                           +         3.220          +
276004        TB                                     2.991          +
282004        TB                                     0.824
289004        TB                                     0.848
308004        TB        -                            3.338          +
314004        TB                                     1.362
317004        TB                                     0.763          +
312004        TB        -                  +         1.079          +
D176          PPD       -                            0.145
D162          PPD                                    0.073
D161          PPD                                    0.097
D27           PPD                                    0.082
A6-124        NORMAL                                 0.053
A6-125        NORMAL                                 0.087
A6-126        NORMAL                                 0.346
A6-127        NORMAL    -        -         -         0.064
A6-128        NORMAL                                 0.034
A6-129        NORMAL                                 0.037
A6-130        NORMAL                                 0.057
A6-131        NORMAL                                 0.054
A6-132        NORMAL                                 0.022
A6-133        NORMAL                                 0.147
A6-134        NORMAL                                 0.101
A6-135        NORMAL                                 0.066
A6-136        NORMAL    -        -                   0.054
A6-137        NORMAL    -        -                   0.065
A6-138        NORMAL                                 0.041
A6-139        NORMAL                                 0.103
A6-140        NORMAL                       -         0.212
A6-141        NORMAL    -        -                   0.056
A6-142        NORMAL    -        -                   0.051
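
The cut-off rule stated for this assay (mean of the negatives plus three standard deviations) can be sketched using the OD 450 values listed in Table 5. Two assumptions are made here that the text does not confirm: that "negatives" means the nineteen NORMAL sera only (the PPD sera are left out), and that the sample standard deviation is used. Under those assumptions every one of the 24 TB sera lies above the cut-off.

```python
from statistics import mean, stdev

# Fusion-protein OD 450 values from Table 5.
NORMAL_OD = [0.053, 0.087, 0.346, 0.064, 0.034, 0.037, 0.057, 0.054, 0.022,
             0.147, 0.101, 0.066, 0.054, 0.065, 0.041, 0.103, 0.212, 0.056, 0.051]
TB_OD = [0.413, 0.392, 2.217, 0.522, 0.937, 1.098, 2.077, 1.675, 2.388, 0.607,
         0.667, 1.070, 1.152, 2.694, 3.258, 2.514, 3.220, 2.991, 0.824, 0.848,
         3.338, 1.362, 0.763, 1.079]

# Cut-off = mean of negatives + 3 sample standard deviations (assumed definition).
cutoff = mean(NORMAL_OD) + 3 * stdev(NORMAL_OD)
tb_positive = [od for od in TB_OD if od > cutoff]

print(f"cut-off = {cutoff:.3f}; TB sera above cut-off: {len(tb_positive)}/{len(TB_OD)}")
```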



The reactivity of the fusion protein TbF-2 with sera from M.
tuberculosis-infected patients was examined by ELISA using the protocol described
above. The results of these studies (Table 6) demonstrate that all four antigens function
independently in the fusion protein.

Table 6

REACTIVITY OF TBF-2 FUSION PROTEIN WITH TB AND NORMAL SERA

One of skill in the art will appreciate that the order of the individual
antigens within the fusion protein may be changed and that comparable activity would
be expected provided each of the epitopes is still functionally available. In addition,
truncated forms of the proteins containing active epitopes may be used in the
construction of fusion proteins.



From the foregoing, it will be appreciated that, although specific
embodiments of the invention have been described herein for the purpose of illustration,
various modifications may be made without deviating from the spirit and scope of the
invention.

CLAIMS

We claim:

1. A polypeptide comprising an antigenic portion of a soluble
M. tuberculosis antigen, or a variant of said antigen that differs only in conservative
substitutions and/or modifications, wherein said antigen has an N-terminal sequence selected
from the group consisting of:

(a) Asp-Pro-Val-Asp-Ala-Val-Ile-Asn-Thr-Thr-Cys-Asn-Tyr-Gly-Gln-
Val-Val-Ala-Ala-Leu (SEQ ID NO: 115);

(b) Ala-Val-Glu-Ser-Gly-Met-Leu-Ala-Leu-Gly-Thr-Pro-Ala-Pro-Ser
(SEQ ID NO: 116);

(c) Ala-Ala-Met-Lys-Pro-Arg-Thr-Gly-Asp-Gly-Pro-Leu-Glu-Ala-Ala-
Lys-Glu-Gly-Arg (SEQ ID NO: 117);

(d) Tyr-Tyr-Trp-Cys-Pro-Gly-Gln-Pro-Phe-Asp-Pro-Ala-Trp-Gly-Pro
(SEQ ID NO: 118);

(e) Asp-Ile-Gly-Ser-Glu-Ser-Thr-Glu-Asp-Gln-Gln-Xaa-Ala-Val (SEQ ID
NO: 119);

(f) Ala-Glu-Glu-Ser-Ile-Ser-Thr-Xaa-Glu-Xaa-Ile-Val-Pro (SEQ ID
NO: 120);

(g) Asp-Pro-Glu-Pro-Ala-Pro-Pro-Val-Pro-Thr-Thr-Ala-Ala-Ser-Pro-Pro-
Ser (SEQ ID NO: 121);

(h) Ala-Pro-Lys-Thr-Tyr-Xaa-Glu-Glu-Leu-Lys-Gly-Thr-Asp-Thr-Gly
(SEQ ID NO: 122);

(i) Asp-Pro-Ala-Ser-Ala-Pro-Asp-Val-Pro-Thr-Ala-Ala-Gln-Leu-Thr-Ser-
Leu-Leu-Asn-Ser-Leu-Ala-Asp-Pro-Asn-Val-Ser-Phe-Ala-Asn (SEQ
ID NO: 123); and

(j) Ala-Pro-Glu-Ser-Gly-Ala-Gly-Leu-Gly-Gly-Thr-Val-Gln-Ala-Gly;
(SEQ ID NO: 131)
wherein Xaa may be any amino acid.

2. A polypeptide comprising an immunogenic portion of an
M. tuberculosis antigen, or a variant of said antigen that differs only in conservative
substitutions and/or modifications, wherein said antigen has an N-terminal sequence selected
from the group consisting of:

(a) Asp-Pro-Pro-Asp-Pro-His-Gln-Xaa-Asp-Met-Thr-Lys-Gly-Tyr-Tyr-
Pro-Gly-Gly-Arg-Arg-Xaa-Phe; (SEQ ID NO: 124) and

(b) Xaa-Tyr-Ile-Ala-Tyr-Xaa-Thr-Thr-Ala-Gly-Ile-Val-Pro-Gly-Lys-Ile-
Asn-Val-His-Leu-Val; (SEQ ID NO: 132), wherein Xaa may be any
amino acid.



3. A polypeptide comprising an antigenic portion of a soluble
M. tuberculosis antigen, or a variant of said antigen that differs only in conservative
substitutions and/or modifications, wherein said antigen comprises an amino acid sequence
encoded by a DNA sequence selected from the group consisting of the sequences recited in
SEQ ID NOS: 1, 2, 4-10, 13-25, 52, 94 and 96, the complements of said sequences, and DNA
sequences that hybridize to a sequence recited in SEQ ID NOS: 1, 2, 4-10, 13-25, 52, 94 and
96 or a complement thereof under moderately stringent conditions.

4. A polypeptide comprising an antigenic portion of a M. tuberculosis
antigen, or a variant of said antigen that differs only in conservative substitutions and/or
modifications, wherein said antigen comprises an amino acid sequence encoded by a DNA
sequence selected from the group consisting of the sequences recited in SEQ ID NOS: 26-51,
133, 134, 158-178, 196, 235, 237-242, 248-251, 290-293, 304, 311, 313-315, 317, 319, 323,
324, 328, 330, 332, 334 and 336, the complements of said sequences, and DNA sequences
that hybridize to a sequence recited in SEQ ID NOS: 26-51, 133, 134, 158-178, 196, 235,
237-242, 248-251, 290-293, 304, 311, 313-315, 317, 319, 323, 324, 328, 330, 332, 334 and
336, or a complement thereof under moderately stringent conditions.

5. A DNA molecule comprising a nucleotide sequence encoding a
polypeptide according to any one of claims 1-4.

6. A recombinant expression vector comprising a DNA molecule
according to claim 5.

7. A host cell transformed with an expression vector according to claim 6. 

8. The host cell of claim 7 wherein the host cell is selected from the group
consisting of E. coli, yeast and mammalian cells.

9. A method for detecting M. tuberculosis infection in a biological
sample, comprising:

(a) contacting a biological sample with one or more polypeptides
according to any of claims 1-4; and

(b) detecting in the sample the presence of antibodies that bind to at least
one of the polypeptides, thereby detecting M. tuberculosis infection in the biological sample.

10. A method for detecting M. tuberculosis infection in a biological
sample, comprising:

(a) contacting a biological sample with a polypeptide having an N-
terminal sequence selected from the group consisting of sequences provided in SEQ ID NO:
129 and 130; and

(b) detecting in the sample the presence of antibodies that bind to at least
one of the polypeptides, thereby detecting M. tuberculosis infection in the biological sample.

11. A method for detecting M. tuberculosis infection in a biological
sample, comprising:

(a) contacting a biological sample with one or more polypeptides encoded
by a DNA sequence selected from the group consisting of SEQ ID NOS: 3, 11, 12, 135, 136,
151-155, 184-188, 194-195, 198, 210-220, 232, 234, 256-271, 287, 288, 298-303, 305-310,
312, 316, 318, 320-322, 325-327, 329, 331, 333, 335 and 337, the complements of said
sequences, and DNA sequences that hybridize to a sequence recited in SEQ ID NOS: 3, 11,
12, 135, 136, 151-155, 184-188, 194-195, 198, 210-220, 232, 234, 256-271, 287, 288, 298-
303, 305-310, 312, 316, 318, 320-322, 325-327, 329, 331, 333, 335 and 337; and

(b) detecting in the sample the presence of antibodies that bind to at least
one of the polypeptides, thereby detecting M. tuberculosis infection in the biological sample.

12. The method of any one of claims 9-11 wherein step (a) additionally
comprises contacting the biological sample with a 38 kD M. tuberculosis antigen and step (b)
additionally comprises detecting in the sample the presence of antibodies that bind to the
38 kD M. tuberculosis antigen.

13. The method of any one of claims 9-11 wherein the polypeptide(s) are
bound to a solid support.

14. The method of claim 13 wherein the solid support comprises
nitrocellulose, latex or a plastic material.

15. The method of any one of claims 9-11 wherein the biological sample is
selected from the group consisting of whole blood, serum, plasma, saliva, cerebrospinal fluid
and urine.

16. The method of claim 15 wherein the biological sample is whole blood
or serum.

17. A method for detecting M. tuberculosis infection in a biological
sample, comprising:

(a) contacting the sample with at least two oligonucleotide primers in a
polymerase chain reaction, wherein at least one of the oligonucleotide primers is specific for a
DNA molecule according to claim 5; and

(b) detecting in the sample a DNA sequence that amplifies in the presence
of the oligonucleotide primers, thereby detecting M. tuberculosis infection.

18. The method of claim 17, wherein at least one of the oligonucleotide
primers comprises at least about 10 contiguous nucleotides of a DNA molecule according to
claim 5.

19. A method for detecting M. tuberculosis infection in a biological
sample, comprising:

(a) contacting the sample with at least two oligonucleotide primers in a
polymerase chain reaction, wherein at least one of the oligonucleotide primers is specific for a
DNA sequence selected from the group consisting of SEQ ID NOS: 3, 11, 12, 135, 136, 151-
155, 184-188, 194-195, 198, 210-220, 232, 234, 256-271, 287, 288, 298-303, 305-310, 312,
316, 318, 320-322, 325-327, 329, 331, 333, 335 and 337; and

(b) detecting in the sample a DNA sequence that amplifies in the presence
of the first and second oligonucleotide primers, thereby detecting M. tuberculosis infection.

20. The method of claim 19, wherein at least one of the oligonucleotide
primers comprises at least about 10 contiguous nucleotides of a DNA sequence selected from
the group consisting of SEQ ID NOS: 3, 11, 12, 135, 136, 151-155, 184-188, 194-195, 198,
210-220, 232, 234, 256-271, 287, 288, 298-303, 305-310, 312, 316, 318, 320-322, 325-327,
329, 331, 333, 335 and 337.

21. The method of claims 17 or 19 wherein the biological sample is
selected from the group consisting of whole blood, sputum, serum, plasma, saliva,
cerebrospinal fluid and urine.

22. A method for detecting M. tuberculosis infection in a biological
sample, comprising:

(a) contacting the sample with one or more oligonucleotide probes specific
for a DNA molecule according to claim 5; and

(b) detecting in the sample a DNA sequence that hybridizes to the
oligonucleotide probe, thereby detecting M. tuberculosis infection.

23. The method of claim 22 wherein the probe comprises at least about 15
contiguous nucleotides of a DNA molecule according to claim 5.

24. A method for detecting M. tuberculosis infection in a biological
sample, comprising:

(a) contacting the sample with one or more oligonucleotide probes specific
for a DNA sequence selected from the group consisting of SEQ ID NOS: 3, 11, 12, 135, 136,
151-155, 184-188, 194-195, 198, 210-220, 232, 234, 256-271, 287, 288, 298-303, 305-310,
312, 316, 318, 320-322, 325-327, 329, 331, 333, 335 and 337; and

(b) detecting in the sample a DNA sequence that hybridizes to the
oligonucleotide probe, thereby detecting M. tuberculosis infection.

25. The method of claim 24 wherein the oligonucleotide probe comprises
at least about 15 contiguous nucleotides of a DNA sequence selected from the group
consisting of SEQ ID NOS: 3, 11, 12, 135, 136, 151-155, 184-188, 194-195, 198, 210-220,
232, 234, 256-271, 287, 288, 298-303, 305-310, 312, 316, 318, 320-322, 325-327, 329, 331,
333, 335 and 337.

26. The method of claims 22 or 24 wherein the biological sample is
selected from the group consisting of whole blood, sputum, serum, plasma, saliva,
cerebrospinal fluid and urine.

27. A method for detecting M. tuberculosis infection in a biological
sample, comprising:

(a) contacting the biological sample with a binding agent which is capable
of binding to a polypeptide according to any one of claims 1-4; and

(b) detecting in the sample a protein or polypeptide that binds to the
binding agent, thereby detecting M. tuberculosis infection in the biological sample.

28. A method for detecting M. tuberculosis infection in a biological
sample, comprising:

(a) contacting the biological sample with a binding agent which is capable
of binding to a polypeptide having an N-terminal sequence selected from the group consisting
of sequences provided in SEQ ID NO: 129 and 130; and

(b) detecting in the sample a protein or polypeptide that binds to the
binding agent, thereby detecting M. tuberculosis infection in the biological sample.

29. A method for detecting M. tuberculosis infection in a biological
sample, comprising:

(a) contacting the biological sample with a binding agent which is capable
of binding to a polypeptide encoded by a DNA sequence selected from the group consisting
of SEQ ID NOS: 3, 11, 12, 135, 136, 151-155, 184-188, 194-195, 198, 210-220, 232, 234,
256-271, 287, 288, 298-303, 305-310, 312, 316, 318, 320-322, 325-327, 329, 331, 333, 335
and 337, the complements of said sequences, and DNA sequences that hybridize to a
sequence recited in SEQ ID NOS: 3, 11, 12, 135, 136, 151-155, 184-188, 194-195, 198, 210-
220, 232, 234, 256-271, 287, 288, 298-303, 305-310, 312, 316, 318, 320-322, 325-327, 329,
331, 333, 335 and 337; and

(b) detecting in the sample a protein or polypeptide that binds to the
binding agent, thereby detecting M. tuberculosis infection in the biological sample.

30. The method of any one of claims 27-29 wherein the binding agent is a
monoclonal antibody.

31. The method of any one of claims 27-29 wherein the binding agent is a
polyclonal antibody.



32. A diagnostic kit comprising:

(a) one or more polypeptides according to any of claims 1-4; and

(b) a detection reagent.

33. A diagnostic kit comprising:

(a) one or more polypeptides having an N-terminal sequence selected from
the group consisting of sequences provided in SEQ ID NO: 129 and 130; and

(b) a detection reagent.

34. A diagnostic kit comprising:

(a) one or more polypeptides encoded by a DNA sequence selected from
the group consisting of SEQ ID NOS: 3, 11, 12, 135, 136, 151-155, 184-188, 194-195, 198,
210-220, 232, 234, 256-271, 287, 288, 298-303, 305-310, 312, 316, 318, 320-322, 325-327,
329, 331, 333, 335 and 337, the complements of said sequences, and DNA sequences that
hybridize to a sequence recited in SEQ ID NOS: 3, 11, 12, 135, 136, 151-155, 184-188, 194-
195, 198, 210-220, 232, 234, 256-271, 287, 288, 298-303, 305-310, 312, 316, 318, 320-322,
325-327, 329, 331, 333, 335 and 337; and

(b) a detection reagent.

35. The kit of any one of claims 32-34 wherein the polypeptide(s) are
immobilized on a solid support.

36. The kit of claim 35 wherein the solid support comprises nitrocellulose,
latex or a plastic material.

37. The kit of any one of claims 32-34 wherein the detection reagent
comprises a reporter group conjugated to a binding agent.

38. The kit of claim 37 wherein the binding agent is selected from the
group consisting of anti-immunoglobulins, Protein G, Protein A and lectins.

39. The kit of claim 37 wherein the reporter group is selected from the
group consisting of radioisotopes, fluorescent groups, luminescent groups, enzymes, biotin,
dye particles and colloidal particles.

40. A diagnostic kit comprising at least two oligonucleotide primers, at
least one of the oligonucleotide primers being specific for a DNA molecule according to
claim 5.

41. A diagnostic kit according to claim 40, wherein at least one of the
oligonucleotide primers comprises at least about 10 contiguous nucleotides of a DNA
molecule according to claim 5.

42. A diagnostic kit comprising at least two oligonucleotide primers, at
least one of the primers being specific for a DNA sequence selected from the group consisting
of SEQ ID NOS: 3, 11, 12, 135, 136, 151-155, 184-188, 194-195, 198, 210-220, 232, 234,
256-271, 287, 288, 298-303, 305-310, 312, 316, 318, 320-322, 325-327, 329, 331, 333, 335
and 337.

43. A diagnostic kit according to claim 42, wherein at least one of the
oligonucleotide primers comprises at least about 10 contiguous nucleotides of a DNA
sequence selected from the group consisting of SEQ ID NOS: 3, 11, 12, 135, 136, 151-155,
184-188, 194-195, 198, 210-220, 232, 234, 256-271, 287, 288, 298-303, 305-310, 312, 316,
318, 320-322, 325-327, 329, 331, 333, 335 and 337.

44. A diagnostic kit comprising at least one oligonucleotide probe, the
oligonucleotide probe being specific for a DNA molecule according to claim 5.

45. A kit according to claim 44, wherein the oligonucleotide probe
comprises at least about 15 contiguous nucleotides of a DNA molecule according to claim 5.

46. A diagnostic kit comprising at least one oligonucleotide probe, the
oligonucleotide probe being specific for a DNA sequence selected from the group consisting
of SEQ ID NOS: 3, 11, 12, 135, 136, 151-155, 184-188, 194-195, 198, 210-220, 232, 234,
256-271, 287, 288, 298-303, 305-310, 312, 316, 318, 320-322, 325-327, 329, 331, 333, 335
and 337.

47. A kit according to claim 46, wherein the oligonucleotide probe
comprises at least about 15 contiguous nucleotides of a DNA sequence selected from the
group consisting of SEQ ID NOS: 3, 11, 12, 135, 136, 151-155, 184-188, 194-195, 198, 210-
220, 232, 234, 256-271, 287, 288, 298-303, 305-310, 312, 316, 318, 320-322, 325-327, 329,
331, 333, 335 and 337.



48. A monoclonal antibody that binds to a polypeptide according to any of
claims 1-4.

49. A polyclonal antibody that binds to a polypeptide according to any of
claims 1-4.

50. A fusion protein comprising two or more polypeptides according to
any one of claims 1-4.

51. A fusion protein comprising one or more polypeptides according to
any one of claims 1-4 and ESAT-6 (SEQ ID NO: 99).

52. A fusion protein comprising a polypeptide having an N-terminal
sequence selected from the group of sequences provided in SEQ ID NOS: 129 and 130.

53. A fusion protein comprising one or more polypeptides according to
any one of claims 1-4 and the M. tuberculosis antigen 38 kD (SEQ ID NO: 150).

54. A diagnostic kit comprising: 

(a) one or more fusion proteins according to any one of claims 50-53; and 

(b) a detection reagent.

SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANTS: Reed, Steven G.
Skeiky, Yasir A.W.
Dillon, Davin C.
Campos-Neto, Antonio
Houghton, Raymond
Vedvick, Thomas S.
Twardzik, Daniel R.
Lodes, Michael J.
Hendrickson, Ronald

(ii) TITLE OF INVENTION: COMPOUNDS AND METHODS FOR DIAGNOSIS OF
TUBERCULOSIS

(iii) NUMBER OF SEQUENCES: 350

(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: SEED and BERRY LLP
(B) STREET: 6300 Columbia Center, 701 Fifth Avenue
(C) CITY: Seattle
(D) STATE: Washington
(E) COUNTRY: USA
(F) ZIP: 98104-7092

(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.30

(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER:
(B) FILING DATE: 05-MAY-1998
(C) CLASSIFICATION:

(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: Maki, David J.
(B) REGISTRATION NUMBER: 31,392
(C) REFERENCE/DOCKET NUMBER: 210121.417C9

(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: (206) 622-4900
(B) TELEFAX: (206) 682-6031



(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 766 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

CGAGGCACCG GTAGTTTGAA CCAAACGCAC AATCGACGGG CAAACGAACG GAAGAACACA 

ACCATGAAGA TGGTGAAATC GATCGCCGCA GGTCTGACCG CCGCGGCTGC AATCGGCGCC 

GCTGCGGCCG GTGTGACTTC GATCATGGCT GGCGGCCCGG TCGTATACCA GATGCAGCCG 

GTCGTCTTCG GCGCGCCACT GCCGTTGGAC CCGGCATCCG CCCCrTOACGT CCCGACCGCC 

GCCCAGTTGA CCAGCCTGCT CAACAGCCTC GCCGATCCCA ACGTGTCGTT TGCGAACAAG 

GGCAGTCTGG TCGAGGGCGG CATCGGGGGC ACCGAGGCGC GCATCGCCGA CCACAAGCTG 

AAGAAGGCCG CCGAGCACGG GGATCTGCCG CTGTCGTTCA GCGTGACGAA CATCCAGCCG 

GCGGCCGCCG GTTCGGCCAC CGCCGACGTT TCCGTCTCGG GTCCGAAGCT CTCGTCGCCG 

GTCACGCAGA ACGTCACGTT CGTGAATCAA GGCGGCTGGA TGCTGTCACG CGCATCGGCG 

ATGGAGTTGC TGCAGGCCGC AGGGNAACTG ATTGGCGGGC CGGNTTCAGC CCGCTGTTCA 

GCTACGCCGC CCGCCTGGTG ACGCGTCCAT GTCGAACACT CGCGCGTGTA GCACGGTGCG 

GTNTGCGCAG GGNCGCACGC ACCGCCCGGT GCAAGCCGTC CTCGAGATAG GTGGTGNCTC 

GNCACCAGNG ANCACCCCCN NNTCGNCNNT TCTCGNTGNT GNATGA 

(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 752 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

ATGCATCACC ATCACCATCA CGATGAAGTC ACGGTAGAGA CGACCTCCGT CTTCCGCGCA 
GACTTCCTCA GCGAGCTGGA CGCTCCTGCG CAAGCGGGTA CGGAGAGCGC GGTCTCCGGG 
GTGGAAGGGC TCCCGCCGGG CTCGGCGTTG CTGGTAGTCA AACGAGGCCC CAACGCCGGG 
TCCCGGTTCC TACTCGACCA AGCCATCACG TCGGCTGGTC GGCATCCCGA CAGCGACATA 
TTTCTCGACG ACGTGACCGT GAGCCGTCGC CATGCTGAAT TCCGGTTGGA AAACAACGAA 
TTCAATGTCG TCGATGTCGG GAGTCTCAAC GGCACCTACG TCAACCGCGA GCCCGTGGAT 
TC3GCGGTGC TGGCGAACGG CGACGAGGTC CAGATCGGCA AGCTCCGGTT GGTGTTCTTG 
ACCGGACCCA AGCAAGGCGA GGATGACGGG AGTACCGGGG GCCCGTGAGC GCACCCGATA
GCCCCGCGCT GGCCGGGATG TCGATCGGGG CGGTCCTCCG ACCTGCTACG ACCGGAnTT
CCCTGATGTC CACCATCTCC AAGATTCGAT TCTTGGGAGG CTTGAGGGTC NGGGTGACCC 
CCCCGCGGGC CTCATTCNGG GGmPCGGCN GGmCACCC CNTACCNACT GCCNCCCGGN 
TTGCNAATTC NTTCTTCNCT GCCCNNAAAG GGACCNTTAN CTTGCCGCTN GAAANGGTNA 
TCCNGGGCCC NTCCTNGAAN CCCCNTCCCC CT 
(2) INFORMATION FOR SEQ ID NO : 3 : 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 813 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 3 : 

CATATGCATC ACCATCACCA TCACACTTCT AACCGCCCAG CGCGTCGGGG GCGTCGAGCA 
CCACGCGACA CCGGGCCCGA TCGATCTGCT AGCTTGAGTC TGGTCAGGCA TCGTCGTCAG 
CAGCGCGATG CCCTATGTTT GTCGTCGACT CAGATATCGC GGCAATCCAA TCTCCCGCCT 
GCGGCCGGCG GTGCTGCAAA CTACTCCCGG AGGAATTTCG ACGTGCGCAT CAAGATCTTC 
ATGCTGGTCA CGGCTGTCGT TTTGCTCTGT TGTTCGGGTG TGGCCACGGC CGCGCCCAAG 
ACCTACTGCG AGGAGTTGAA AGGCACCGAT ACCGGCCAGG CGTGCCAGAT TCAAATGTCC 
GACCCGGCCT ACAACATCAA CATCAGCCTG CCCAGTTACT ACCCCGACCA GAAGTCGCTG 
GAAAATTACA TCGCCCAGAC GCGCGACAAG TTCCTCAGCG CGGCCACATC GTCCACTCCA 
CGCGAAGCCC CCTACGAATT GAATATCACC TCGGCCACAT ACCAGTCCGC GATACCGCCG 
CGTGGTACGC AGGCC3TGGT GCTCAMGGTC TACCACAACG CCGGCGGCAC GCACCCAACG 
ACCACGTACA AGGCCTTCGA TTGGGACCAG GCCTATCGCA AGCCAATCAC CTATGACACG 
CTGTGGCAGG CTGACACCGA TCCGCTGCCA GTCGTCTTCC CCATTGTTGC AAGGTGAACT 
GAGCAACGCA GACCGGGACA ACWGGTATCG ATAGCCGCCN AATGCCGGCT TGGAACCCNG 
TGAAATTATC ACAACTTCGC AGTCACNAAA NAA 
(2) INFORMATION FOR SEQ ID NO : 4 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 447 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:
CGGTATGAAC ACGGCCGCGT CCGATAACTT CCAGCTGTCC CAGGGTGGGC AGGGATTCGC 
CAITCCGATC GGGCAGGCGA TGGCGATCGC GGGCCAGATC CGATCGGGTG GGGGGTCACC 
CACCGTTCAT ATCGGGCCTA CCGCCTTCCT CGGCTTGGGT GTTGTCGACA ACAACGGCAA 
CGGCGCACGA GTCCAACGCG TGGTCGGGAG CGCTCCGGCG GCAAGTCTCG GCATCTCCAC 
CGGCGACGTG ATCACCGCGG TCGACGGCGC TCCGATCAAC TCGGCCACCG CGATGGCGGA 
CGCGdTAAC GGGCATCATC CCGGTGACGT CATCTCGGTG AACTGGCAAA CCAAGTCGGG 
CGGCACGCGT ACAGGGAACG TGACATTGGC CGAGGGACCC CCGGCCTGAT TTCGTCGYGG 
ATACCACCCG CCGGCCGGCC AATTGGA 
(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 604 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:
GTCCCACTGC GGTCGCCGAG TATGTCGCCC AGCAAATGTC TGGCAGCCGC CCAACGGAAT 
CCGGTGATCC GACGTCGCAG GTTGTCGAAC CCGCCGCCGC GGAAGTATCG GTCCATGCCT 
AGCCCGGCGA CGGCGAGCGC CGGAATGGCG CGAGTGAGGA GGCGGGCAAT TTGGCGGGGC 
CCGGCGACGG NGAGCGCCGG .^TGGCGCGA GTGAGGAGGT GGNCAGTCAT GCCCAGNGTG 
ATCCAATCAA CCTGNATTCG GNCTGNGGGN CCATTTGACA ATCGAGGTAG TGAGCGCAAA 
TGAATGATGG AAAACGGGNG GNGACGTCCG NTGTTCTGGT GGTGNTAGGT GNCTGNCTGG 
NGTNGNGGNT ATCAGGATGT TCTTCGNCGA AANCTGATGN CGAGGAACAG GGTGTNCCCG 
NNANNCCNAN GGNGTCCNAN CCCNNNNTCC TCGNCGANAT CANANAGNCG NTTGATGNGA 
NAAAAGGGTG GANCAGNNNN AANTNGNGGN CCNAANAANC NNNANNGNNG NNAGNTNGNT 
NNNTNTTNNC ANNNNNNNTG NNGNNGNNCN NNNCAANCNN NTTJNNNGNAA NNGGNTmT 
NAAT 

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 633 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:
TTGCANGTCG AACCACCTCA CTAAAGGGAA CAAAAGCTNG AGCTCCACCG CGGTGGCGGC 
CGCTCTAGAA CTAGTGKAT« YYYCKGGCTG CAGSAATYCG GYACGAGCAT TAGGACAGTC 
TAACGGTCCT GTTACGGTGA TCGAATGACC GACGACATCC TGCTGATCGA GACCGACGAA 
CGGGTGCGAA CCCTCACCCT CAACCGGCCG CAGTCCCGYA ACGCGCTCTC GGCGGCGCTA 
CGGGATCGGT TTTTCGCGGY GTTGGYCGAC GCCGAGGYCG ACGACGACAT CGACGTCGTC 
ATCCTCACCG GYGCC3ATCC GGTGTTCTGC GCCGGACTGG ACCTCAAGGT AGCTGGCCGG 
GCAGACCGCG CTGCCGGACA TCTCACCGCG GTGGGCGGCC ATGACCAAGC CGGTGATCGG 
CGCGATCAAC GGCGCCGCGG TCACCGGCGG GCTCGAACTG GCGCTGTACT GCGACATCCT 
GATCGCCTCC GAGCACGCCC GCTTCGNCGA CACCCACGCC CGGGTGGGGC TGCTGCCCAC 
CTGGGGACTC AGTGTGTGCT TGCCGCAAAA GGTCGGCATC GGNCTGGGCC GGTGGATGAG 
CCTGACCGGC GACTACCTGT CCGTGACCGA CGC 
(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1362 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:
CGACGACGAC GGCGCCGGAG AGCGGGCGCG .^CGGCGATC GACGCGGCCC TGGCCAGAGT 
CGGCACCACC CAGGAGGGAG TCGAATCATG AAATTTGTCA ACCATATTGA GCCCGTCGCG 
CCCCGCCGAG CCGGCGGCGC GGTCGCCGAG GTCTATGCCG AGGCCCGCCG CGAGTTCGGC 
CGGCTGCCCG AGCCGCTCGC CATGCTGTCC CCGGACGAGG GACTGCTCAC CGCCGGCTGG 
GCGACGTTGC GCGAGACACT GCTGGTGGGC CAGGTGCCGC GTGGCCGCAA GGAAGCCGTC 
GCCGCCGCCG TCGCGGCCAG CCTGCGCTGC CCCTGGTGCG TCGACGCACA CACCACCATG 
CTGTACGCGG CAGGCCAAAC CGACACCGCC GCGGCGATCT TGGCCGGCAC AGCACCTGCC 
GCCGGTGACC CGAACGCGCC GTATGTGGCG TGGGCGGCAG GAACCGGGAC ACCGGCGGGA
CCGCCGGCAC CGTTCGGCCC GGATGTCGCC GCCGAATACC TGGGCACCGC GGTGCAATTC
CACTTCATCG CACGCCTGGT CCTGGTGCTG CTGGACGAAA CCTTCCTGCC GGGGGGCCCG 
CGC3CCCAAC AGCTCATGCG CCGCGCCGGT GGACTGGTGT TCGCCCGCAA GGTGCGCGCG 
GAGCATCGGC CGGGCCGCTC CACCCGCCGG CTCGAGCCGC GAACGCTGCC CGACGATCTG 
GCATGGGCAA CACCGTCCGA GCCCATAGCA ACCGCGITCG CCGCGCTCAG CCACCACGTG 
GACACCGCGC CGCACCTGCC GCCACCGACT CGTCAGGTGG TCAGGCGGGT CGTGGGGTCG 
TGGCACGGCG AGCCAATGCC GATGAGCAGT CGCTGGACGA ACGAGCACAC CGCCGAGCTG 
CCCGCCGACC TGCACGCGCC CACCCGTCTT GCCCTGCTGA CCGGCCTGGC CCCGCATCAG 
GTGACCGACG ACGACGTCGC CGCGGCCCGA TCCGTGCTCG ACACCGATGC GGCGCTGGTT 
GGCGCGCTGG CCTGGGCCGC CTTCACCGCC GCGCGGCGCA TCGGCACCTG GATCGGCGCC 
GCCGCCGAGG GCCAGGTGTC GCGGCAAAAC CCGACTGGGT GAGTGTGCGC GCCCTGTCGG 
TAGGGTGTCA TCGCTGGCCC GAGGGATCTC GCGGCGGCGA ACGGAGGTGG CGACACAGGT 
GGAAGCTGCG CCCACTGGCT TGCGCCCCAA CGCCGTCGTG GGCGTTCGGT TGGCCGCACT 
GGCCGATCAG GTCGGCGCCG GCCCTTGGCC GAAGGTCCAG CTCAACGTGC CGTCACCGAA 
GGACCGGACG GTCACCGGGG GTCACCCTGC GCGCCCAAGG AA 
(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1458 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:
GCGACGACCC CGATATGCCG GGCACCGTAG CGAAAGCCGT CGCCGACGCA CTCGGGCGCG 
GTATCGCTCC CGTTGAGGAC AnCAGGACT GCGTGGAGGC CCGGCTGGGG GAAGCCGGTC 
TGGATGACGT GGCCCGTGTT TACATCATCT ACCGGCAGCG GCGCGCCGAG CTGCGGACGG 
CTAAGGCCrr GCTCGGCGTG CGGGACGAGT TAAAGCTGAG CTTGGCGGCC GTGACGGTAC 
TGCGCGAGCG CTATCTGCTG CACGACGAGC AGGGCCGGCC GGCCGAGTCG ACCGGCGAGC 

TGATGGACCG ATCGGCGCGC TGTGTCGCr^ rrrrrnT^r^r^. ^ 

i^i^.^QCGG CGGCCGAGGA CCAGTATGAG CCGGGCTCGT 

CGAGGCGGTG GGCCGAGCGG TTCGCCACGC TATTACGCAA CCTGGAATTC CTGCCGAATT 
CGCCCACGTT GATGAACTCT GGCACCGACC TGGGACTGCT CGCCGGCTGT TTTGTTCTGC
CGATTGAGGA TTCGCTGCAA TCGATCTTTG CGACGCTGGG ACAGGCCGCC GAGCTGCAGC
GGGCTGGAGG CGGCACCGGA TATGCGTTCA GCCACCTGCG ACCCGCCGGG GATCGGGTGG 
CCTCCACGGG CGGCACGGCC AGCGGACCGG TGTCGriTCT ACGGCTGTAT GACAGTGCCG 
CGGGTGTGGT CTCCATGGGC GGTCGCCGGC GTGGCGCCTG TATGGCTGTG CTTGATGTGT 
CGCACCCGGA TATCTGTGAT TTCGTCACCG CCAAGGCCGA ATCCCCCAGC GAGCTCCCGC 
ATTTCAACCT ATCGGTTGGT GTGACCGACG CGrrCCTGCG GGCCGTCGAA CGCAACGGCC 
TACACCGGCT GGTCAATCCG CGAACCGGCA AGATCGTCGC GCGGATGCCC GCCGCCGAGC 
TGTTCGACGC CATCTGCAAA GCCGCGCACG CCGGTGGCGA TCCCGGGCTG GTGTTrcrrCG 
ACACGATCAA TAGGGCAAAC CCGGTGCCGG GGAGAGGCCG CATCGAGGCG ACCAACCCGT 
GCGGGGAGGT CCCACTGCTG CCTTACGAGT CATGTAATCT CGGCTCGATC AACCTCGCCC 
GGATGCTCGC CGACGGTCGC GTCGACTGGG ACCGGCTCGA GGAGGTCGCC GGTGTGGCGG 
TGCGGTTCCT TGATGACGTC ATCGATGTCA GCCGCTACCC CTTCCCCGAA CTGGGTGAGG 
CGGCCCGCGC CACCCGCAAG ATCGGGCTGG GAGTCATGGG rTTGGCGGAA CTGCTTGCCG 
CACTGGGTAT TCCGTACGAC AGTGAAGAAG CCGTGCGGTT AGCCACCCGG CTCATGCGTC 
GCATACAGCA GGCGGCGCAC ACGGCATCGC GGAGGCTGGC CGAAGAGCGG GGCGCATTCC 
CGGCGTTCAC CGATAGCGGG TTCGCGCGGT CGGGCCCGAG GCGCAACGCA CAGGTCACCT 
CCGTCGCTCC QACOGGCA 
(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 862 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

ACGGTGTAAT CGTGCTGGAT CTGGAACCGC GTGGCCCGCT ACCTACCGAG ATCTACTGGC    60
GGC3CAGGGG GCTGGCCCTG GGCATCGCGG TCGTCGTAGT CGGGATCGCG GTGGCCATCG    120
TCATCGCCrr CGTCGACAGC AGCGCCGGTG CCAAACCGGT CAGCGCCGAC AAGCCGGCCT    180
CCGCCCAGAG CCATCCGGGC TCGCCGGCAC CCCAAGCACC CCAGCCGGCC GGGCAAACCG    240
AAGGTAACGC CGCCGCGGCC CCGCCGCAGG GCCAAAACCC CGAGACACCC ACGCCCACCG    300
CCGCGGTGCA GCCGCCGCCG GTGCTCAAGG AAGGGGACGA TTGCCCCGAT TCGACGCTGG    360
CCGTCAAAGG TTTGACCAAC GCGCCGCAGT ACTACGTCGG CGACCAGCCG AAGTTCACCA    420
TGGTGGTCAC CAACATCGGC CTGGTGTCCT GTAAACGCGA CGTTGGGGCC GCGGTGTTGG    480
CCGCCTACGT TTACTCGCTG GACAACAAGC GGTTGTGGTC CAACCTGGAC TGCGCGCCCT    540
CGAATGAGAC GCTGGTCAAG ACGTTTTCGC CCGGTGAGCA GGTAACOACC GCGGTGACCT    600
GGACCGGGAT GGGATCGGCG CCGCGCTGCC CATTGCCGCG GCCGGCGATC GGGCCGGGCA    660
CCTACAATCT CGTGGTACAA CTGGGCAATC TGCGCTCGCT GCCGGTTCCG TTCATCCTGA    720
ATCAGCCGCC GCCGCCGCCG GGGCCGGTAC CCGCTCCGGG TCCAGCGCAG GCGCCTCCGC    780
CGGAGTCTCC CGCGCAAGGC GGATAATTAT TGATCGCTGA TGGTCGATTC CGCCAGCTGT    840
GACAACCCCT CGCCTCGTGC CG    862


(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 622 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

TTGATCAGCA CCGGCAAGGC GTCACATGCC TCCCTGGGTG TGCAGGTGAC CAATGACAAA    60
GACACCCCGG GCGCCAAGAT CGTCGAAGTA GTGGCCGGTG GTGCTGCCGC GAACGCTGGA    120
GTGCCGAAGG GCGTCGTTGT CACCAAGGTC GACGACCGCC CGATCAACAG CGCGGACGCG    180
TTGGTTGCCG CCGTGCGGTC CAAAGCGCCG GGCGCCACGG TGGCGCTAAC CTTTCAGGAT    240
CCCTCGGGCG GTAGCCGCAC AGTGCAAGTC ACCCTCGGCA AGGCGGAGCA GTGATGAAGG    300
TCGCCGCGCA GTGTTCAAAG CTCGGATATA CGGTGGCACC CATGGAACAG CGTGCGGAGT    360
TGGTGGTTGG CCGGGCACTT GTCGTCGTCG TTGACGATCG CACGGCGCAC GGCGATGAAG    420
ACCACAGCGG GCCGCTTGTC ACCGAGCTGC TCACCGAGGC CGGGTTTGTT GTCGACGGCG    480
TGGTGGCGGT GTCGGCCGAC GAGGTCGAGA TCCGAAATGC GCTGAACACA GCGGTGATCG    540
GCGGGGTGGA CCTGGTGGTG TCGGTCGGCG GGACCGGNGT GACGNCTCGC GATGTCACCC    600
CGGAAGCCAC CCGNGACATT CT    622


(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1200 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

GGCGCAGCGG TAAGCCTGTT GGCCGCCGGC ACACTGGTGT TGACAGCATG CGGPrjnTfinp    60
ACCAACAGCT CGTCGTCAGG CGCAGGCGGA ACGTCTGGGT CGGTGCACTG CGGcr^pn an    120
AAGGAGCTCC ACTCCAGCGG CTCGACCGCA CAAGAAAATG CCATGGAGCA    180
GCCTACGTGC GATCGTGCCC GGGCTACACG TTGGACTACA ACGCCAACGG    240
GGGGTGACCC AGTTTCTCAA CAACGAAACC GATTTCGCCG GCTCGGATGT    300
CCGTCGACCG GTCAACCTGA CCGGTCGGCG GAGCGGTGCG GTTCCCCGGC    360
CCGACGGTGT TCGGCCCGAT CGCGATCACC TACAATATCA AGGGCGTGAG    420
CTTGACGGAC CCACTACCGC CAAGATTTTC AACGGCACCA TCACCGTGTG    480
CAGATCCAAG CCCTCAACTC CGGCACCGAC CTGCCGCCAA CACCGATTAG    540
CGCAGCGACA AGTCCGGTAC GTCGGACAAC TTCCAGAAAT ACCTCGACGG    600
GGGGCGTGGG GCAAAGGCGC CAGCGAAACG TTCAGCGGGG GCGTCGGrnT    660
GGGAACAACG GAACGTCGOr ACGACCGACG GGTCGATCAC CTACAACGAG    720
TGGTCGTTTG CGGTGGGTAA GCAGTTGAAC ATGGCCCAGA TCATCACGTC GGCGGGTCCG    780
GATCCAGTGG CGATCACCAC CGAGTCGGTC GGTAAGACAA TCGCCGGGGC CAAGATCATG    840
GGACAAGGCA ACGACCTGGT ATTGGACACG TCGTCGTTCT ACAGACCCAC CCAGCCTGGC    900
TCTTACCCGA TCGTGCTGGC GACCTATGAG ATCGTCTGCT CGAAATACCC GGATGCGACG    960
ACCGGTACTG CGGTAAGGGC GTTTATGCAA GCCGCGATTG GTCCAGGCCA AGAAGGCCTG    1020
GACCAATACG GCTCCATTCC GTTGCCCAAA TCGTTCCAAG CAAAATTGGC GGCCGCGGTG    1080
AATGCTATTT CTTGACCTAG TGAAGGGAAT TCGACGGTGA GCGATGCCGT TCCGCAGGTA    1140
GGGTCGCAAT TTGGGCCGTA TCAGCTATTG CGGCTGCTGG GCCGAGGCGG GATGGGCGAG    1200



(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1155 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

GCAAGCAGCT GCAGGTCGTG CTGTTCGACG AACTGGGCAT GCCGAAGACC AAACGCACCA    60
AGACCGGCTA CACCACGGAT GCCGACGCGC TGCAGTCGTT GTTCGACAAG ACCGGGCATC    120
CGTTTCTGCA ACATCTGCTC GCCCACCGCG ACGTCACCCG GCTCAAGGTC ACCGTCGACG    180
GGTTGCTCCA AGCGGTGGCC GCCGACGGCC GCATCCACAC CACGTTCAAC CAGACGATCG    240
CCGCGACCGG CCGGCTCTCC TCGACCGAAC CCAACCTGCA GAACATCCCG ATCCGCACCG    300
ACGCGGGCCG GCGGATCCGG GACGCGTTCG TGGTCGGGGA CGGTTACGCC GAGTTGATGA    360
CGGCCGACTA CAGCCAGATC GAGATGCGGA TCATGGGGCA CCTGTCCGGG GACGAGGGCC    420
TCATCGAGGC G7TCAACACC GGGGAGGACC TGTATTCGTT CGTCGCGTCC CGGGTGTTC3    480
GTGTGCCCAT CGACGAGGTC ACCGGCGAGT TGCGGCGCCG GGTCAAGGCG ATGTCGTACG    540
GGCTGGTTTA CGGGTTGAGC GCCTACGGCC TGTCGCAGCA GTTGAAAATC TCCACCGAGG    600
AAGCCAACGA GCAGATGGAC GCGTATTTCG CCCGATTCGG CGGGGTGCGC GACTACCTGC    660
GCGCCGTAGT CGAGCGGGCC CGCAAGGACG GCTACACCTC GACGGTGCTG GGCCGTCGCC    720
GCTACCTGCC CGAGCTGGAC AGCAGCAACC GTCAAGTGCG GGAGGCCGCC GAGCGGnmr;    780
CGCTGAACGC GCGGATCCAG GGCAGCGCGG CCGACATCAT CAAGGTGGCC ATGATCCAGG    840
^'s-vjrt.v.-^^Liuv- \jt--^-AAC^3AG u\-ACAGCTGG CGTCGCGCAT GCTGCTGCAG GTCCACGACG    900
AGCrGCTGTT CGAAATCGCC CCCGGTGAAC GCGAGCGGGT CGAGGCCCTG GTGCGCGACA    960
AGATGGGCGG CGCTTACCCG CTCGACGTCC CGCTGGAGGT GTCGGTGGGC TACGGCCGCA    1020
GCTGGGACGC GGCGGCGCAC TGAGTGCCGA GCGTGCATCT GGGGCGGGAA TTCGGCGATT    1080
TTTCCGCCCT GAGTTCACGC TCGGCGCAAT CGGGACCGAG TTTGTCCAGC GTGTACCCGT    1140
CGAGTAGCCT CGTCA    1155


(2) INFORMATION FOR SEQ ID NO: 13: 











(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1771 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 
ICGCCGTC TGGTGrrTGA ACGGTTTTAC CGGTCGGCAT CGGCACGGGC (^TTGCCGGGT    60
TCGGGCCTCG GGTTGGCGAT CGTCAAACAG GTGGTGCTCA ACCACGGCGG ATTGCTGCGC    120
ATCGAAGACA CCGACCCAGG CGGCCAGCCC V- L TtjrtjAAL GT CGATTTACGT GCTGCTCCCC    180
GGCCGTCGGA TGCCGATTCC GCAGCTTCCC CTGGCGCTCG GAGCACGGAC    240
ATCGAGAACT CTCGGGGTTC GGCGAACGTT AATCTCAGTC CACGCGCGCA    300
ACCTAGTTGT GCAGTTACTG TTGAAAGCCA AGTCCACGCA TGGCCAAGTT    360
GGCCCGAGTA GTGGGCCTAG TACAGGAAGA GACATGACGA ATCACCCACG    420
GTATTCGCCA CCGCCGCAGC AGCCGGGAAC (jCTuACjGGG C AGCAGCAAAC    480
GTACAGCCAG CAGTTCGACT GGCGTTACCC LLLt-wGCAGC CAACCCAGTA    540
CCGTCAACCC TACGAGGCGT TGGGTGGTAC u w o w U L. oViO i CTGATACCTG GCGTGATTCC    600
GACCATGAC3 CCCCCTCCTG GGATGGTTCG CGTGCAGGCA TGTTGGCCAT    660
CGGCGCGGTG ACGATAGCGG TGGTGTCCGC GGCGCGGCCG CATCCCTGGT    720
CGGGTTCAAC CGGGCACCCG CCGGCCCCAG GTGGCTGCCA GCGCGGCGCC    780
AAGCATCCCC GCAGCAAACA TGCCGCCGGG u 1 Uoo 1 v-LxAA CAGGTGGCGG CCAAGGTGGT    840
GCCCAGTGTC GTCATGTTGG AAACCGATCT *^j<aCj V, CGC lLAG TCGGAGGAGG GCTCCGGCAT    900
CATTCTGTCT GCCGAGGGGC TGATCTTGAC AAv. AAC UAC GTGATCGCGG CGGCCGCCAA    960
GCCTCCCCTG GGCAGTCCGC CGCCGAAAAC taAC EjCj r AAC C TTCTCTGACG GGCGGACCGC    1020
ACCCTTCACG GTGGTGGGGG CTGACCCCAC v-AGTGATATC GCCGTCGTCC GTGTTCAGGG    1080
CGTCTCCGGG CTCACCCCGA TCTCCCTGGG TTCCTCCTCG GACCTGAGGG TCGGTCZAGCC    1140
GGTGCTGGCG ATCGGGTCGC CGCTCGGTTT UUAGG G CAC C GTGACCACGG GGATCG7CAG    1200
CGCTCTCAAC CGTCCAGTGT CGACGACCGG HjAGGCCGGC AACCAGAACA CCGTGCTGGA    1260
CGCCATTCAG ACCGACGCCG CGATCAACCC U»jGTAACTCC GGGGGCGCGC TGGTGAACAT    1320
GAACGCTCAA CrCGTCGGAG TCAACTCGGC CATTGCCACG CTGGGCGCGG ACTCAGCCGA    1380
TGCGCAGAGC GGCTCGATCG GTCTCGGTTT TGCGATTCCA GTCGACCAGG CCAAGCGCAT    1440
CGCCGACGAG TTGATCAGCA CCGGCAAGGC GTCACATGCC TCCCTGGGTG TGCAGGTGAC    1500
CAATGACAAA GACACCCCGG GCGCCAAGAT CGTCGAAGTA GTGGCCGGTG GTGCTGCCGC    1560
GAACGCTGGA GTGCCGAAGG GCGTCGTTGT CACCAAGGTC GACGACCGCC CGATCAACAG    1620
CGCGGACGC3 TTGGTTGCCG CCGTGCGGTC CAAAGCGCCG GGCGCCACGG TGGCGCTAAC    1680
CTTTCAGGAT CCCTCGGGCG GTAGCCGCAC AGTGCAAGTC ACCCTCGGCA AGGCGGAGCA    1740
GTGATGAAGG TCGCCGCGCA GTGTTCAAAG C    1771
(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1058 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:
CTCCACCGCG GTGGCGGCCG CTCTAGAACT AGTGGATCCC CCGGGCTGCA GGAATTCGGC 
ACGAGGATCC GACGTCGCAG GTTGTCGAAC CCGCCGCCGC GGAAGTATCG GTCCATGCCT 
AGCCCGGCGA CGGCGAGCGC CGGAATGGCG CGAGTGAGGA GGCGGGCAAT TTGGCGGGGC 
CCGGCGACGG CGAGCGCCGG AATGGCGCGA GTGAGGAGGC GGGCAGTCAT GCCCAGCGTG 
ATCCAATCAA CCTGCATTCG GCCTGCGGGC CCATTTGACA ATCGAGGTAG TGAGCGCAAA 
TGAATGATGG AAAACGGGCG GTGACGTCCG CTGTTCTGGT GGTGCTAGGT GCCTGCCTGG 
CGTTGTGGCT ATCAGGATGT TCTTCGCCGA AACCTGATGC CGAGGAACAG GGTGTTCCCG 
TGAGCCCGAC GGCGTCCGAC CCCGCGCTCC TCGCCGAGAT CAGGCAGTCG CTTGATGCGA 
CAAAAGGGTT GACCAGCGTG CACGTAGCGG TCCGAACAAC CGGGAAAGTC GACAGCTTGC 
TGGGTATTAC CAGTGCCGAT 3TCGACGTCC GGGCCAATCC GCTCGCGGCA AAGGGCGTAT 
3CACCTACAA CGACGAGCAG GGTGTCCCGT TTCGGGTACA AGGCGACAAC ATCTCGGTGA 
AACTGTTCGA CGACTGGAGC AATCTCGGCT CGATTTCTGA ACTGTCAACT TCACGCGTGC 
TCGATCCTGC CGCTGGGGTG ACGCAGCTGC TGTCCGGTGT CACGAACCTC CAAGCGCAAG 
GTACCGAAGT GATAGACGGA ATTTCGACCA CCAAAATCAC CGGGACCATC CCCGCGAGCT 
CTGTCAAGAT GCTTGATCCT 3GCGCCAAGA GTGCAAGGCC GGCGACCGTG TGGATTGCCC 
AGGACGGCTC GCACCACCTC GTCCGAGCGA GCATCGACCT CGGATCCGGG TCGATTCAGC 
TCACGCAGTC GAAATGGAAC GAACCCGTCA ACGTCGACTA GGCCGAAGTT GCGTCGACGC 
GTTGNTCGAA ACGCCCTTGT GAACGGTGTC AACGGNAC 
(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 542 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

GAATTCGGCA CGAGAGGTGA TCGACATCAT CGGGACCAGC CCCACATCCT GGGAACAGGC    60
GGCGGCGGAG GCGGTCCAGC GGGCGCGGGA TAGCGTCGAT GACATCCGCG TG3CTCGGGT    120
CATTGAGCAG GACATGGCCG TGGACAGCGC CGGCAAGATC ACCTACCGCA TCAAGCTCGA    180
AGTGTCGTTC AAGATGAGGC CGGCGCAACC GCGCTAGCAC GGGCCGGCGA GCAAGACGCA    240
AAATCGCACG GTTTGCGGTT GATTCGTGCG ATTTTGTGTC TGCTCGCCGA GGCCTACCAG    300
GCGCGGCCCA GGTCCGCGTG CTGCCGTATC CAGGCGTGCA TCGCGATTCC GGCGGCCACG    360
CCGGAGTTAA TGCTTCGCGT CGACCCGAAC TGGGCGATCC GCCGGNGAGC TGATCGATGA    420
CCGTGGCCAG CCCGTCGATG CCCGAGTTGC CCGAGGAAAC GTGCTGCCAG GCCGGTAGGA    480
AGCGTCCGTA GGCGGCGGTG CTGACCGGCT CTGCCTGCGC CCTCAGTGCG GCCAGCGAGC    540
GG    542



(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 913 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

CGGTGCCGCC CGCGCCTCCG TTGCCCCCAT TGCCGCCGTC GCCGATCAGC TGCGCATCGC    60
CACCATCACC GCCTTTGCCG CCGGCACCGC CGGTGGCGCC GGGGCCGCCG ATGCCACCGC    120
TTGACCCTGG CCGCCGGCGC CGCCATTGCC ATACAGCACC CCGCCGGGGG CACCGTTACC    180
GCCGTCGCCA CCGTCGCCGC CGCTGCCGTT TCAGGCCGGG GAGGCCGAAT GAACCGCCGC    240
CAAGCCCGCC GCCGGCACCG TTGCCGCCTT TTCCGCCCGC CCCGCCGGCG CCGCCAATTG    300
CCGAACAGCC AMGCACCGTT GCCGCCAGCC CCGCCGCCGT TAACGGCGCT GCCGGGCGCC    360
GCCGCCGGAC CCGCCATTAC CGCCGTTCCC GTTCGGTGCC CCGCCGTTAC CGGCGCCGCC    420
GTTTGCCGCC AATATTCGGC GGGCACCGCC AGACCCGCCG GGGCCACCAT TGCCGCCGGG    480
CACCGAAACA ACAGCCCAAC GGTGCCGCCG GCCCCGCCGT TTGCCGCCAT CACCGGCCAT    540
TCACCGCCAG CACCGCCGTT AATGTTTATG AACCCGGTAC CGCCAGCGCG GCCCCTATTG    600
CCGGGCGCCG GAGNGCGTGC CCGCCGGCGC CGCCAACGCC CAAAAGCCCG GGGTTGCCAC    660
CGGCCCCGCC GGACCCACCG GTCCCGCCGA TCCCCCCGTT GCCGCCGGTG CCGCCGCCAT    720
TGGTGCTGCT GAAGCCGTTA GCGCCGGTTC CGCSGGTTCC GGCGGTGGCG CCNTGGCCGC    780
CGGCCCCGCC GTTGCCGTAC AGCCACCCCC CGGTGGCGCC GTTGCCGCCA TTGCCGCCAT    840
TGCCGCCGTT GCCGCCATTG CCGCCGTTCC CGCCGCCACC GCCGGNTTGG CCGCCGGCGC    900
CGCCGGCGGC CGC    913

(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1872 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:
GACTACGTTG GTGTAGAAAA ATCCTGCCGC CCGGACCCTT AAGGCTGGGA CAATTTCTGA 
TAGCTACCCC GACACAGGAG GTTACGGGAT GAGCAATTCG CGCCGCCGCT CACTCAGGTG 
GTCATGGTTG CTGAGCGTGC TGGCTGCCGT CGGGCTGGGC CTGGCCACGG CGCCGGCCCA 
GGCGGCCCCG CCGGCCTTGT CGCAGGACCG GTTCGCCGAC TTCCCCGCGC TGCCCCTCGA 
CCCGTCCGCG ATGGTCGCCC AAGTGGCGCC ACAGGTGGTC AACATCAACA CCAAACTGGG 
CTACAACAAC GCCGTGGGCG CCGGGACCGG CATCGTCATC GATCCCAACG GTGTCGTGCT 
GACCAACAAC CACGTGATCG CGGGCGCCAC CGACATCAAT GCGTTCAGCG TCGGCTCCGG 
CCAAACCTAC GGCGTCGATG TGGTCGGGTA TGACCGCACC CAGGATGTCG CGGTGCTGCA 
GCTGCGCGGT GCCGGTGGCC TGCCGTCGGC GGCGATCGGT GGCGGCGTCG CGGTTGGTGA 
GCCCGTCGTC GCGATGGGCA ACAGCGGTGG GCAGGGCGGA ACGCCCCGTG CGGTGCCTGG 
CAGGGTGGTC GCGCTCGGCC AAACCGTGCA GGCGTCGGAT TCGCTGACCG GTGCCGAAGA 
GACATTGAAC GGGTTGATCC AGTTCGATGC CGCAATCCAG CCCGGTGATT CGGGCGGGCC 
CGTCGTCAAC GGCCTAGGAC AGGTGGTCGG TATGAACACG GCCGCGTCCG ATAACTTCCA 
GCTGTCCCAG GGTGGGCAGG GATTCGCCAT TCCGATCGGG CAGGCGATGG CGATCGCGGG 
CCAAATCCGA TCGGGTGGGG GGTCACCCAC CGTTCATATC GGGCCTACCG CCTTCCTCGG 
CTTGGGTGTT GTCGACAACA ACGGCAACGG CGCACGAGTC CAACGCGTGG TCGGAAGCGC 
TCC3GCGGCA AGTCTCGGCA TCTCGACCGG CGACGTGATC ACCGCGGTCG ACGGCGCTCC
GATCAACTCG GCCACCGCGA TGGCGGACGC GCTTAACGGG CATCATCCCG GTCACCTTCAT
CTCGGTGAAC TGGCAAACCA AGTCGGGCGG CACGCGTACA GGGAACGTGA CATTGGCCGA 
GGGACCCCCG GCCTGATTTG TCGCGGATAC CACCCGCCGG CCGGCCAATT GGATTGGCGC 
CAGCCGTGAT TGCCGCGTGA GCCCCCGAGT TCCGTCTCCC GTGCSCGTGG CArTGTGGAA 
GCAATGAACG AGGCAGAACA CAGCGTTCAG CACCCTCCCG TGCAGGGCAG TTACGTCGAA 
GGCGGTGTGG TCGAGCATCC GGATGCCAAG GACTTCGGCA GCGCCGCCGC CCTXJCCCGCC 
GATCCGACCT GGTTTAAGCA CGCCGTCTTC TACGAGGTGC TGGTCCGGGC GTTCTTCGAC 
GCCAGCGCGG ACGGTTCCGN CGATCTGCGT GGACTCATCG ATCGCCTCGA CTACCTGCAG 
TGGCTTGGCA TCGACTGCAT CTG7TGCCGC CGTTCCTACG ACTCACCGCT GCGCGACGGC 
GGTTACGACA TTCGCGACTT CTACAAGGTG CTGCCCGAAT TCGGCACCGT CGACGATTTC 
GTCGCCCTGG TCGACACCGC TCACCGGCGA GGTATCCGCA TCATCACCGA CCTGGTGATG 
AATCACACCT CGGAGTCGCA CCCCTGGTTT CAGGAGTCCC GCCGCGACCC AGACGGACCG 
TACGGTGACT ATTACGTGTG GAGCGACACC AGCGAGCGCT ACACCGACGC CCGGATCATC 
TTCGTCGACA CCGAAGAGTC GAACTGGTCA TTCGATCCTG TCCGCCGACA GTINCTACTG 
GCACCGATTC TT 

(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1482 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:
CTTCGCCGAA ACCTGATGCC GAGGAACAGG GTGTTCCCGT GAGCCCGACG GCGTCCGACC 
CCGCGCTCCT CGCCGAGATC AGGCAGTCGC TTGATGCGAC AAAAGGGTTG ACCAGCGTGC 
ACGTAGCGGT CCGAACAACC GGGAAAGTCG ACAGCTTGCT GGGTATTACC AGTGCCGATG 
TCGACGTCCG GGCCAATCCG CTCGCGGCAA AGGGCGTATG CACCTACAAC GACGAGCAGG 
GTGTCCCGTT TCGGGTACAA GGCGACAACA TCTCGGTGAA ACTGTTCGAC GACTGGAGCA 
ATCTCGGCTC GATTTCTGAA CTGTCAACTT CACGCGTGCT CGATCCTGCC GCTGGGGTGA 
CGCAGCTGCT GTCCGGTGTC ACGAACCTCC AAGCGCAAGG TACCGAAGTG ATAGACGGAA 
TTTCGACCAC CAAAATCACC GGGACCATCC CCGCGAGCTC TGTCAAGATG ^ATCCTG
GCGCCAAGAG TGCAAGGCCG GCGACCGTGT GGATTGCCCA GGACGGCTCG CACCACCTCG
rCCGAGCGAG CATCGACCTC GGATCCGGGT CGATTCAGCT CACGCAGTCG AAATGGAACG 
AACCCGTCAA CGTCGACTAG GCCGAAGriG CGTCGACGCG TTGCTCGAAA CGCCCITGTG 
AACGGTGTCA ACGGCACCCG AAAACTGACC CCCTGACGGC ATCTGAAAAT TGACCCCCTA 
GACCGGGCGG TTGGTGGTTA TTCTTCGGTG GTTCCGGCTG GTGGGACGCG GCCGAGGTCG 
CGGTCTTTGA GCCGGTAGCT GTCGCCTTrG AGGGCGACGA CTTCAGCATG GTGGACGAGG 
CGGTCGATCA T^CGGCAGC AACGACGTCG TCGCCGCCGA AAACCTCGCC CCACCGGCCG 
AAGGCCTTAT TGGACGTGAC GATCAAGCTG GCCCGCTCAT ACCGGGAGGA CACCAGCTGG 
.:UVGAAGAGGT TGGCGGCCTC GGGCTCAAAC GGAATGTAAC CGACTTCGTC AACCACCAGO 
AGCGGATAGC GGCCAAACCG GGTGAGTTCG GCGTAGATGC GCCCGGCGTG GTGAGCCTCG 
GCGAACCGTG CTACCCArtC GGCGGCGGTG GCGAACAGCA CCCGATGACC GGCCTGACAC 
GCGCGTATCG CCAGGCCGAC CGCAAGATGA GTCTTCCCGG TGCCAGGCGG GGCCCAAAAA 
CACGACGTTA TCGCGGGCGG TGATGAAATC CAGGGTGCCC AGATGTGCGA TGGTGTCGCG 
TTTGAGGCCA CGAGCATGCT CAAAGTCGAA CTCTrcCAAC GACTTCCGAA CCGGGAAGCG 
GGCGGCGCGG ATGCGGCCCT CACCACCATG GGACTCCCGG GCTGACACTT CCCGCTGCAG 
GCAGGCGGCC AGGTATTCTT CGTGGCTCOA GTTCTCGGCG CGGGCGCGAT CGGCCAGCCG 
GGACACTGAC TCACGCAGGG TGGGAGCTTT CAATGCTCTT GT 

(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 976 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:
GAATTCGGCA CGAGCCGCCG ATAGCTTCTG OGCCGCGGCC GACCAGATGG CTCGAGGGTT 
CGTGCTCGGG GCCACCGCCG GGCGCACCAC CCTGACCGGT GAGGGCCTGC AACACGCCGA 
CGGTCACTCG TTGCTGCTGG ACGCCACCAA CCCGGCGGTG GTTGCCTACG ACCCGGCCTT 
CGCCTACGAA ATCGGCTACA TCGNGGAAAG CGGACTGGCC AGGATGTGCG GGGAGAACCC 
GGAGAACATC TTCTTCTACA TCACCGTCTA CAACGAGCCG TACGTGCAGC CGCCGGAGCC
GGAGAACTTC GATCCCC3AGG GCGTGCTGGG GGGTATCTAC CGNTATCACG CGGCCACCGA
GCAACGCACC AACAAGGNGC AGATCCTGGC CTCCGGGGTA GCGATGCCCG CGGCGCTGCG 
GGCAGCACAG ATGCTGGCCG CCGAGTGGGA TGTCGCCGCC GACGTGTGGT CGGTGACCAG 
TTGGGGCGAG CTAAACCGCG ACGGGGTGGT CATCGAGACC GAGAAGCTCC GCCACCCCGA 
TCGGCCGGCG GGCGTGCCCT ACGTGACGAG AGCGCTGGAG AATGCTCGGG GCCCGGTGAT 
CGCGGTGTCG GACTGGATGC GCGCGGTCCC CGAGCAGATC CGACCGTGGG TGCCGGGCAC 
ATACCTCACG ITGGGCACCG ACGGGTTCGG TITTTCCGAC ACTCGGCCCG CCGGTCGTCG 
TTACTTCAAC ACCGACGCCG AATCCCAGGT TGGTCGCGGT TTrGGGAGGG GTTGGCCGGG 
TCGACGGGTG AATATCGACC CATTCGGTGC CGGTCGTGGG CCGCCCGCCC AGTTACCCGG 
ATTCGACGAA GGTGGGGGGT TGCGCCCGAN TAAGTT 
(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1021 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:
ATCCCCCCGG GCTGOIGGAA TTCGGCACGA GAGACAAAAT TCCACGCGTT AATGCAGGAA 
CAGATTCATA ACGAATTCAC AGCGGCACAA CAATATGTCG CGATCGCGGT TTArrTCGAC 
AGCGAAGACC TGCCGCAGTT GGCGAAGCAT TTTTACAGCC AAGCGGTCGA GGAACGAAAC 
CATGCAATGA TGCTCGTGCA ACACCTGCTC GACCGCGACC TTCGTGTCGA AATTCCCGGC 
GTAGACACGG TGCGAAACCA GTTCGACAGA CCCCGCGAGG CACTGGCGCr GGCGCTCGAT 
CAGGAACGCA CAGTCACCGA CCAGGTCGGT CGGCTGACAG CGGTGGCCCG CGACGAGGGC 
GATTTCCTCG GCGAGCAGTT CATGCAGTGG TTCTTGCAGG AACAGATCGA AGAGGTGGCC 
TTGATGGCAA CCCTGGTGCG GGTTGCCGAT CGGGCCGGGG CCAACCTGTr CGAGCTAGAG 
AACTTCGTCG CACGTGAAGT GGATGTGGCG CCGGCCGCAT CAGGCGCCCC GCACGCTGCC 
GGGGGCCGCC TCTAGATCCC TGGGGGGGAT CAGCGAGTGG TCCCGTTCGC CCGCCCGTCT 
TCCAGCCAGG CCTTGGTGCG GCCGGGGTGG TGAGTACCAA TCCAGGCCAC CCCGACCTCC 
CGGNAAAAGT CGATGTCCTC GTACTCATCG ACGTTCCAGG AGTACACCGC CCGGCCCTGA 
GCTGCCGAGC GGTCAAC3AG TTGCGGATAT TCCTTTAACG CAGGCAGTGA GGGTCCCACG
GCGGTTGGCC CGACCGCCGT GGCCGCACTG CTGGTCAGGT ATCGGGGGGT CTTGGCGAGC
AACAACGTCG GCAGGAGGGG TGGAGCCCGC CGGATCCGCA GACCGGGGGG GCGAAAACGA 
CATCAACACC GCACGGGATC GATCTGCGGA GGGGGGTGCG GGAATACCGA ACCGGTGTAG 
GAGCGCCAGC AGTTGTTTTT CCACCAGCGA AGCGTrTTCG GGTCATCGGN GGCNNTTAAG 
T 

(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 321 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:
CGTGCCGACG AACGGAAGAA CACAACCATG AAGATGGTGA AATCGATCGC CGCAGGTCTG 
ACCGCCGCGG CTGCAATCGG CGCCGCTGCG GCCGGTGTGA CTTCGATCAT GGCTGGCGGN 
CCGGTCGTAT ACCAGATGCA GCCGGTCGTC TTCGGCGCGC CACTGCCGTT GGACCCGGNA 
TCCGCCCCTG ANGTCCCGAC CGCCGCCCAG TGGACCAGNC TGCTCAACAG NCTCGNCGAT 
CCCAACGTGT CGrTTGNGAA CAAGGGNAGT CTGGTCGAGG GNGGNATCGG NGGNANCGAG 
GGNGNGNATC 3NC3ANCACA A 
(2) INFORMATION FOR SEQ ID NO: 22:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 373 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:
TCTTATCGGT TCCGGTTGGC GACGGGTTTT GGGNGCGGGT GGTPAACCCG CTCGGCCAGC 
CGATCGACGG GCGCGGAGAC GTCGACTCCG ATACTCGGCG CGCGCTGGAG CTCCAGGCGC 
CCTCGGTGGT GNACCGGCAA GGCGTGAAGG AGCCGTTGNA GACCGGGATC AAGGCGATTG 
ACGCGATGAC CCCGATCGGC CGCGGGCAGC GCCAGCTGAT CATCGGGGAC CGCAAGACCG 
GCAAAAACCG CCGTCTGTGT CGGACACCAT CCTCAAACCA GCGGGAAGAA CTGGGAGTCC 
GGTGGATCCC AAGAAGCAGG TGCGCTTGTG TATACGTTGG CCATCGGGCA AGAAGGGGAA 




CTTACCATCG CCG 

373 

(2) INFORMATION FOR SEQ ID NO:23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 352 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:

GTGACGCCGT GATGGGATTC CTGGGCGGGG CCGGTCCGCT GGCGGTGGTG GATCAGCAAC 60

TGGTTACCCG GGTGCCGCAA GGCTGGTCGT TTGCTCAGGC AGCCGCTGTG CCGGTGGTGT 120

TCTTGACGGC CTGGTACGGG TTGGCCGATT TAGCCGAGAT CAAGGCGGGC GAATCGGTGC 180

TGATCCATGC CGGTACCGGC GGTGTGGGCA TGGCGGCTGT GCAGCTGGCT CGCCAGTGGG 240

GCGTGGAGGT TTTCGTCACC GCCAGCCGTG GNAAGTGGGA CACGCTGCGC GCCATNGNGT 300

TTGACGACGA NCCATATCGG NGATTCCCNC ACATNCGAAG TTCCGANGGA GA 352


(2) INFORMATION FOR SEQ ID NO: 24: 











(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 726 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:24: 



GAAATCCGCG TTCATTCCGT TCGACCAGCG GCTGGCGATA ATCGACGAAG TGATCAAGCC 60

GCGGTTCGCG GCGCTCATGG GTCACAGCGA GTAATCAGCA AGTTCTCTGG TATATCGCAC 120

CTAGCGTCCA GTTGCTTGCC AGATCGCTTT CGTACCGTCA TCGCATGTAC CGGTTCGCGT 180

GCCGCACGCT CATGCTGGCG GCGTGCATCC TGGCCACGGG TGTGGCGGGT CTCGGGGTCG 240

GCGCGCAGTC CGCAGCCCAA ACCGCGCCGG TGCCCGACTA CTACTGGTGC CCGGGGCAGC 300

CTTTCGACCC CGCATGGGGG CCCAACTGGG ATCCCTACAC CTGCCATGAC GACTTCCACC 360

GCGACAGCGA CGGCCCCGAC CACAGCCGCG ACTACCCCGG ACCCATCCTC GAAGGTCCCG 420

TGCTTGACGA TCCCGGTGCT GCGCCGCCGC CCCCGGCTGC CGGTGGCGGC GCATAGCGCT 480

CGTTGACCGG GCCGCATCAG CGAATACGCG TATAAACCCG GGCGTGCCCC CGGCAAGCTA 540

CGACCCCCGG CGGGGCAGAT TTACGCTCCC GTGCCGATGG ATCGCGCCGT CCGATGACAG 600

AAAATAGGCG ACGGTTTTGG CAACCGCTTG GAGGACGCTT GAAGGGAACC TGTCATGAAC 660

GGCGACAGCG CCTCCACCAT CGACATCGAC AAGGTTGTTA CCCGCACACC CGTTCGCCGG 720

ATCGTG 726

(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 580 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:
CGCGACGACG ACGAACGTCG GGCCCACCAC CGCCTATGCG TTGATGCAGG CGACCGGGAT 
GGTCGCCGAC CATATCCAAG CATGCTGGGT GCCCACTGAG CGACCTTTTG ACCAGCCGGG 
CTGCCCGATG GCGGCCCGGT GAAGTCATTG CGCCGGGGCT TGTGCACCTG ATGAACCCGA 
ATAGGGAACA ATAGGGGGGT GATTTGGCAG TTCAATGTCG GGTATGGCTG GAAATCCAAT 
GGCGGGGCAT GCTCGGCGCC GACCAGGCTC GCGCAGGCGG GCCAGCCCGA ATCTGGAGGG 
AGCACTCAAT GGCGGCGATG AAGCCCCGGA CCGGCGACGG TCCTTTGGAA GCAACTAAGG 
AGGGGCGCGG CATTGTGATG CGAGTACCAC TTGAGGGTGG CGGTCGCCTG GTCGTCGAGC 
TGACACCCGA CGAAGCCGCC GCACTGGGTG ACGAACTCAA AGGCGTTACT AGCTAAGACC 
AGCCCAACGG CGAATGGTCG GCGTTACGCG CACACCTTCC GGTAGATGTC CAGTGTCTGC 
TCGGCGATGT ATGCCCAGGA GAACTCTTGG ATACAGCGCT 
(2) INFORMATION FOR SEQ ID NO: 26:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 160 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 

AACGGAGGCG CCGGGGGriT TGGCGGGGCC GGGGCGGTCG GCGGCAACGG CGGGGCCGGC 

GGTACCGCCG GGTTGTTCGG TGTCGGCGGG GCCGGTGGGG CCGGAGGCAA CGGCATCGCC 

GGTGTCACGG GTACGTCGGC CAGCACACCG GGTGGATCCG 

(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 272 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:
GACACCGATA CGATGGTGAT GTACGCCAAC GTTGTCGACA CGCTCGAGGC GTTCACGATC 
CAGCGCACAC CCGACGGCGT GACCATCGGC GATGCGGCCC CGTTCGCGGA GGCGGCTGCC 
AAGGCGATGG GAATCGACAA GCTGCGGGTA AnCATACCG GAATGGACCC CGTCGTCGCT 
GAACGCGAAC AGTGGGACGA CGGCAACAAC ACGTTGGCGT TGGCGCCCGG TGTCGTTGTC 
GCCTACGAGC GCAACGTACA GACCAACGCC CG 
(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 317 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28: 
GCAGCCGGTG GTTCTCGGAC TATCTGCGCA CGGTGACGCA GCGCGACGTG CGCGAGCTGA
AGCGGATCGA GCAGACGGAT CGCCTGCCGC GGTTCATGCG CTACCTGGCC GCTATCACCG 
CGCAGGAGCT GAACGTGGCC GAAGCGGCGC GGGTCATCGG GGTCGACGCG GGGACGATCC 
GTTCGGATCT GGCGTGGTTC GAGACGGTCT ATCTGGTACA TCGCCTGCCC GCCTGGTCGC 
GGAATCTGAC CGCGAAGATC AAGAAGCGGT CAAAGATCCA CGTCGTCGAC AGTGGCTTCG 
CGGCCTGGTT GCGCGGG 

(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 182 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:
GATCGTGGAG CTGTCGATGA ACAGCGTTGC CGGACGCGCG GCGGCCAGCA CGTCGGTGTA 
GCAGCGCCGG ACCACCTCGC CGGTGGGCAG CATGGTGATG ACCACGTCGG CCTCGGCCAC 
CGCTTCGGGC GCGCTACGAA ACACCGCGAC ACCGTGCGCG GCGGCGCCGG ACGCCGCCGT 

GG

(2) INFORMATION FOR SEQ ID NO: 30:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 308 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 

GATCGCGAAG TTTGGTGAGC AGGTGGTCGA CGCGAAAGTC TGGGCGCCTG CGAAGCGGGT 60

CGGCGTTCAC GAGGCGAAGA CACGCCTGTC CGAGCTGCTG CGGCTCGTCT ACGGCGGGCA 120

GAGGTTGAGA TTGCCCGCCG CGGCGAGCCG GTAGCAAAGC TTGTGCCGCT GCATCCTCAT 180

GAGACTCGGC GGTTAGGCAT TGACCATGGC GTGTACCGCG TGCCCGACGA TTTGGACGCT 240

CCGTTGTCAG ACGACGTGCT CGAACGCTTT CACCGGTGAA GCGCTACCTC ATCGACACCC 300

ACGTTTGG 308

(2) INFORMATION FOR SEQ ID NO: 31:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 267 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:

CCGACGACGA GCAACTCACG TGGATGATGG TCGGCAGCGG CATTGAGGAC GGAGAGAATC 60

CGGCCGAAGC TGCCGCGCGG CAAGTGCTCA TAGTGACCGG CCGTAGAGGG CTCCCCCGAT 120

GGCACCGGAC TATTCTGGTG TGCCGCTGGC CGGTAAGAGC GGGTAAAAGA ATGTGAGGGG 180

ACACGATGAG CAATCACACC TACCGAGTGA TCGAGATCGT CGGGACCTCG CCCGACGGCG 240

TCGACGCGGC AATCCAGGGC GGTCTGG 267
(2) INFORMATION FOR SEQ ID NO: 32:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1539 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32:
CTCGTGCCGA AAGAATGTGA GGGGACACGA TGAGCAATCA CACCTACCGA GTGATCGAGA 60

TCGTCGGGAC CTCGCCCGAC GGCGTCGACG CGGCAATCCA GGGCGGTCTG GCCCGAGCTG 120

CGCAGACCAT GCGCGCGCTG GACTGGTTCG AAGTACAGTC AATTCGAGGC CACCTGGTCG 180

ACGGAGCGGT CGCGCACTTC CAGGTGACTA TGAAAGTCGG CTTCCGCTGG AGGATTCCTG 240

AACCTTCAAG CGCGGCCGAT AACTGAGGTG CATCATTAAG CGACTTTTCC AGAACATCCT 300

GACGCGCTCG AAACGCGGTT CAGCCGACGG TGGCTCCGCC GAGGCGCTGC CTCCAAAATC 360

CCTGCGACAA TTCGTCGGCG GCGCCTACAA GGAAGTCGGT GCTGAATTCG TCGGGTATCT 420

GGTCGACCTG TGTGGGCTGC AGCCGGACGA AGCGGTGCTC GACGTCGGCT GCGGCTCGGG 480

GCGGATGGCG TTGCCGCTCA CCGGCTATCT GAACAGCGAG GGACGCTACG CCGGCTTCGA 540

TATCTCGCAG AAAGCCATCG CGTGGTGCCA GGAGCACATC ACCTCGGCGC ACCCCAACTT 600

CCAGTTCGAG GTCTCCGACA TCTACAACTC GCTGTACAAC CCGAAAGGGA AATACCAGTC 660

ACTAGACTTT CGCTTTCCAT ATCCGGATGC GTCGTTCGAT GTGGTGTTTC TTACCTCGGT 720

GTTCACCCAC ATGTTTCCGC CGGACGTGGA GCACTATCTG GACGAGATCT CCCGCGTGCT 780

GAAGCCCGGC GGACGATGCC TGTGCACGTA CTTCTTGCTC AATGACGAGT CGTTAGCCCA 840

CATCGCGGAA GGAAAGAGTG CGCACAACTT CCAGCATGAG GGACCGGGTT ATCGGACAAT 900

CCACAAGAAG CGGCCCGAAG AAGCAATCGG CTTGCCGGAG ACCTTCGTCA GGGATGTCTA 960

TGGCAAGTTC GGCCTCGCCG TGCACGAACC ATTGCACTAC GGCTCATGGA GTGGCCGGGA 1020

ACCACGCCTA AGCTTCCAGG ACATCGTCAT CGCGACCAAA ACCGCGAGCT AGGTCGGCAT 1080

CC3GGAAGCA TCGCGACACC GTGGCGCCGA GCGCCGCTGC CGGCAGGCCG ATTAGGCGGG 1140

CAGATTAGCC CGCCGCGGCT CCCGGCTCCG AGTACGGCGC CCCGAATGGC GTCACCGGCT 1200

GGTAACCACG CTTGCGCGCC TGGGCGGCGG CCTGCCGGAT CAGGTGGTAG ATGCCGACAA 1260

AGCCTGCGTG ATCGGTCATC ACCAACGGTG ACAGCAGCCG GTTGTGCACC AGCGCGAACG 1320

CCACCCCGGT CTCCGGGTCT GTCCAGCCGA TCGAGCCGCC CAAGCCCACA TGACCAAACC 1380

CCGGCATCAC GTTGCCGATC GGCATACCGT GATAGCCAAG ATGAAAATTT AAGGGCACCA 1440

ATAGATTTCG ATCCGGCAGA ACTTGCCGTC GGTTGCGGGT CAGGCCCGTG ACCAGCTCCC 1500

GCGACAAGAA CCGTATGCCG TCGATCTCGC CTCGTGCCG 1539
(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 851 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:33: 

CTGCAGGGTG GCGTGGATGA GCGTCACCGC GGGGCAGGCC GAGCTGACCG CCGCCCAGGT 60

CCGGGTTGCT GCGGCGGCCT ACGAGACGGC GTATGGGCTG ACGGTGCCCC CGCCGGTGAT 120

CGCCGAGAAC CGTGCTGAAC TGATGATTCT GATAGCGACC AACCTCTTGG GGCAAAACAC 180

CCCGGCGATC GCGGTCAACG AGGCCGAATA CGGCGAGATG TGGGCCCAAG ACGCCGCCGC 240

GATGTTTGGC TACGCCGCGG CGACGGCGAC GGCGACGGCG ACGTTGCTGC CGTTCGAGGA 300

GGCGCCGGAG ATGACCAGCG CGGGTGGGCT CCTCGAGCAG GCCGCCGCGG TCGAGGAGGC 360

CTCC3ACACC GCCGCGGCGA ACCAGTTGAT GAACAATGTG CCCCAGGCGC TGAAACAGTT 420

GGCCCAGCCC ACGCAGGGCA CCACGCCTTC TTCCAAGCTG GGTGGCCTGT GGAAGACGGT 480

CTCGCCGCAT CGGTCGCCGA TCAGCAACAT GGTGTCGATG GCCAACAACC ACATGTCGAT 540

GACCAACTCG GGTGTGTCGA TGACCAACAC CTTGAGCTCG ATGTTGAAGG GCTTTGCTCC 600

GGCGGCGGCC GCCCAGGCCG TGCAAACCGC GGCGCAAAAC GGGGTCCGGG CGATGAGCTC 660

GCTGGGCAGC TCGCTGGGTT CTTCGGGTCT GGGCGGTGGG GTGGCCGCCA ACTTGGGTCG 720

GGCGGCCTCG GTACGGTATG GTCACCGGGA TGGCGGAAAA TATGCANAGT CTGGTCGGCG 780

GAACGGTGGT CCGGCGTAAG GTTTACCCCC GTTTTCTGGA TGCGGTGAAC TTCGTCAACG 840

GAAACAGTTA C 851
(2) INFORMATION FOR SEQ ID NO: 34:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 254 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: 

GATCGATCGG GCGGAAATTT GGACCAGATT CGCCTCCGGC GATAACCCAA TCAATCGAAC 60

CTAGATTTAT TCCGTCCAGG GGCCCGAGTA ATGGCTCGCA GGAGAGGAAC CTTACTGCTG 120

CGGGCACCTG TCGTAGGTCC TCGATACGGC GGAAGGCGTC GACATTTTCC ACCGACVCCC 180

CCATCCAAAC GTTCGAGGGC CACTCCAGCT TGTGAGCGAG GCGACGCAGT CGCAGGCTGC 240

GCTTGGTCAA GATC 254

(2) INFORMATION FOR SEQ ID NO: 35:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1227 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35: 

GATCCTGACC GAAGCGGCCG CCGCCAAGGC GAAGTCGCTG TTGGACCAGG AGGGACGGGA 60 

CGATCTGGCG CTGCGGATCG CGGTTCAGCC GGGGGGGTGC GCTGGATTGC GCTATAACCT 120 

TTTCTTCGAC GACCGGACGC TGGATGGTGA CCAAACCGCG GAGTTCGGTG GTGTCAGGTT 180 

GATCGTGGAC CGGATGAGCG CGCCGTATGT GGAAGGCGCG TCGATCGATT TCGTCGACAC 240

TATTGAGAAG CAAGGTTCAC CATCGACAAT CCCAACGCCA CCGGCTCCTG CGCGTGCGGG 300

GATTCGTTCA ACTGATAAAA CGCTAGTACG ACCCCGCGGT GCGCAACACG TACGAGCACA 360

CCAAGACCTG ACCGCGCTGG AAAAGCAACT GAGCGATGCC TTGCACCTGA CCGCGTGGCG 420

GGCCGCCGGC GGCAGGTGTC ACCTGCATGG TGAACAGCAC CTGGGCCTGA TATTGCGACC 480

AGTACACGAT TTTGTCGATC GAGGTCACTT CGACCTGGGA GAACTGCTTG CGGAACGCGT 540

CGCTGCTCAG CTTGGCCAAG GCCTGATCGG AGCGCTTGTC GCGCACGCCG TCGTGGATAC 600

CGCACAGCGC ATTGCGAACG ATGGTGTCCA CATCGCGGTT CTCCAGCGCG TTGAGGTATC 660

CCTGAATCGC GGTTTTGGCC GGTCCCTCCG AGAATGTGCC TGCCGTGTTG GCTCCGTTGG 720

TGCGGACCCC GTATATGATC GCCGCCGTCA TAGCCGACAC CAGCGCGAGG GCTACCACAA 780

TGCCGATCAG CAGCCGCTTG TGCCGTCGCT TCGGGTAGGA CACCTGCGGC GGCACGCCGG 840

GA7ATGCGGC GGGCGGCAGC GCCGCGTCGT CTGCCGGTCC CGGGGCGAAG GCCGGTTCGG 900

CGGCGCCGAG GTCGTGGGGG TAGTCCAGGG CTTGGGGTTC GTGGGATGAG GGCTCGGGGT 960

ACGGCGCCGG TCCGTTGGTG CCGACACCGG GGTTCGGCGA GTGGGGACCG GGCATTGTGG 1020

TTCTCCTAGG GTGGTGGACG GGACCAGCTG CTAGGGCGAC AACCGCCCGT CGCGTCAGCC 1080

GGCAGCATCG GCAATCAGGT GAGCTCCCTA GGCAGGCTAG CGCAACAGCT GCCGTCAGCT 1140

CTCAACGCGA CGGGGCGGGC CGCGGCGCCG ATAATGTTGA AAGACTAGGC AACCTTAGGA 1200

ACGAAGGACG GAGATTTTGT GACGATC 1227

(2) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 181 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36: 
GCGGTGTCGG CGGATCCGGC GGGTGGTTGA ACGGCAACGG CGGGGCCGGC GGGGCCGGCG 60 
GGACCGGCGC TAACGGTGGT GCCGGCGGCA ACGCCTGGTT GTTCGGGGCC GGCGGGTCCG 120 
GCGGNGCCGG CACCAATGGT GGNGTCGGCG GGTCCGGCGG ATTTGTCTAC GGCAACGGCG 180 

G 

(2) INFORMATION FOR SEQ ID NO: 37: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 290 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:37: 

GCGGTGTCGG CGGATCCGGC GGGTGGTTGA ACGGCAACGG CGGTGTCGGC GGCCGGGGCG 60 

GCGACGGCGT CTTTGCCGGT GCCGGCGGCC AGGGCGGCCT CGGTGGGCAG GGCGGCAATG 120 

GCGGCGGCTC CACCGGCGGC AACGGCGGTC TTGGCGGCGC GGGCGGTGGC GGAGGCAACG 180 

CCCCGGACGG CGGCTTCGGT GGCAACGGCG GTAAGGGTGG CCAGGGCGGN ATTGGCGGCG 240

GCACTCAGAG CGCGACCGGC CTCGGNGGTG ACGGCGGTGA CGGCGGTGAC 290 

(2) INFORMATION FOR SEQ ID NO: 38:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 34 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:
GATCCAGTGG CATGGNGGGT GTCAGTGGAA GCAT 
(2) INFORMATION FOR SEQ ID NO: 39:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 155 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39:

GATCGCTGCT CGTCCCCCCC TTGCCGCCGA CGCCACCGGT CCCACCGTTA CCGAACAAGC 60

TGGCGTGGTC GCCAGCACCC CCGGCACCGC CGACGCCGGA GTCGAACAAT GGCACCGTCG 120 

TATCCCCACC ATTGCCGCCG GNCCCACCGG CACCG 155 
(2) INFORMATION FOR SEQ ID NO: 40: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 53 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40: 
ATGGCGTTCA CGGGGCGCCG GGGACCGGGC AGCCCGGNGG GGCCGGGGGG TGG 53 
(2) INFORMATION FOR SEQ ID NO: 41: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 132 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41:

GATCCACCGC 3GGTGCAGAC GGTGCCCGCG GCGCCACCCC GACCAGCGGC GGCAACGGCG 60

GCACCGGCGG CAACGGCGCG AACGCCACCG TCGTCGGNGG GGCCGGCGGG GCCGGCGGCA 120

AGGGCC-GCAA CG 132

(2) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 132 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42:

GATCGGCGGC CGGNACGGNC GGGGACGGCG GCAAGGGCGG NAACGGGGGC GCCGNAGCCA 60

CCNGCCAAGA ATCCTCCGNG TCCNCCAATG GCGCGAATGG CGGACAGGGC GGCAACGGCG 120

GCANCGGCGG CA 132

(2) INFORMATION FOR SEQ ID NO:43: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 702 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43:

CGGCACGAGG ATCGGTACCC CGCGGCATCG GCAGCTGCCG ATTCGCCGGG TTTCCCCACC 60 

CGAGGAAAGC CGCTACCAGA TGGCGCTGCC GAAGTAGGGC GATCCGTTCG CGATGCCGGC 120 

ATGAACGGGC GGCATCAAAT TAGTGCAGGA ACCTTTCAGT TTAGCGACGA TAATGGCTAT 180 

AGCACTAAGG AGGATGATCC GATATGACGC AGTCGCAGAC CGTGACGGTG GATCAGCAAG 240 

AGATTTTGAA CAGGGCCAAC GAGGTGGAGG CCCCGATGGC GGACCCACCG ACTGATGTCC 300

CCATCACACC GTGCGAACTC ACGGNGGNTA AAAACGCCGC CCAACAGNTG GTNTTGTCCG 360

CCGACAACAT GCGGGAATAC CTGGCGGCCG GTGCCAAAGA GCGGCAGCGT CTGGCGACCT 420

CGCTGCGCAA CGCGGCCAAG GNGTATGGCG AGGTTGATGA GGAGGCTGCG ACCGCGCTGG 480

ACAACGACGG CGAAGGAACT GTGCAGGCAG AATCGGCCGG GGCCGTCGGA GGGGACAGTT 540 

CGGCCGAACT AACCGATACG CCGAGGGTGG CCACGGCCGG TGAACCCAAC TTCATGGATC 600 

TCAAAGAAGC GGCAAGGAAG CTCGAAACGG GCGACCAAGG CGCATCGCTC GCGCACTGNG 660 

GGGATGGGTG GAACACTTNC ACCCTGACGC TGCAAGGCGA CG 702 

(2) INFORMATION FOR SEQ ID NO: 44:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 298 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44:

GAAGCCGCAG CGCTGTCGGG CGACGTGGCG GTCAAAGCGG CATCGCTCGG TGGCGGTGGA 60 

GGCGGCGGGG TGCCGTCGGC GCCGTTGGGA TCCGCGATCG GGGGCGCCGA ATCGGTGCGG 120

CCCGCTGGCG CTGGTGACAT TGCCGGCTTA GGCCAGGGAA GGGCCGGCGG CGGCGCCGCG 180

CTGGGCGGCG GTGGCATGGG AATGCCGATG GGTGCCGCGC ATCAGGGACA AGGGGGCGCC 240

AAGTCCAAGG GTTCTCAGCA GGAAGACGAG GCGCTCTACA CCGAGGATCC TCGTGCCG 298
(2) INFORMATION FOR SEQ ID NO: 45: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1058 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45: 

CGGCACGAGG ATCGAATCGC GTCGCCGGGA GCACAGCGTC GCACTGCACC AGTGGAGGAG 60 

CCATGACCTA CTCGCCGGGT AACCCCGGAT ACCCGCAAGC GCAGCCCGCA GGCTCCTACG 120 

GAGGCGTCAC ACCCTCGTTC GCCCACGCCG ATGAGGGTGC GAGCAAGCTA CCGATGTACC 180 

TGAACATCGC GGTGGCAGTG CTCGGTCTGG CTGCGTACTT CGCCAGCTTC GGCCCAATGT 240

TCACCCTCAG TACCGAACTC GGGGGGGGTG ATGGCGCAGT GTCCGGTGAC ACTGGGCTGC 300 

CGGTCGGGGT GGCTCTGCTG GCTGCGCTGC TTGCCGGGGT GGTTCTGGTG CCTAAGGCCA 360 

AGAGCCATGT GACGGTAGTT GCGGTGCTCG GGGTACTCGG CGTATTTCTG ATGGTCTCGG 420

CGACGTTTAA CAAGCCCAGC GCCTATTCGA CCGGTTGGGC ATTGTGGGTT GTGTTGGCTT 480

TCATCGTGTT CCAGGCGGTT GCGGCAGTCC TGGCGCTCTT GGTGGAGACC GGCGCTA7CA 540

CCGCGCCGGC GCCGCGGCCC AAGTTCGACC CGTATGGACA GTACGGGCGG TACGGGCAGT 600 

ACGGGCAGTA CGGGGTGCAG CCGGGTGGGT ACTACGGTCA GCAGGGTGCT CAGCAGGCCG 660 

CGGGACTGCA GTCGCCCGGC CCGCAGCAGT CTCCGCAGCC TCCCGGATAT GGGTCGCAGT 720

ACGGCGGCTA TTCGTCCAGT CCGAGCCAAT CGGGCAGTGG ATACACTGCT CAGCCCCCGG 780 

CCCAGCCGCC GGCGCAGTCC GGGTCGCAAC AATCGCACCA GGGCCCATCC ACGCCACCTA 840 

CCGGCTTTCC GAGCTTCAGC CCACCACCAC CGGTCAGTGC CGGGACGGGG TCGCAGGCTG 900 

GTTCGGCTCC AGTCAACTAT TCAAACCCCA GCGGGGGCGA GCAGTCGTCG TCCCCCGGGG 960 

GGGCGCCGGT CTAACCGGGC GTTCCCGCGT CCGGTCGCGC GTGTGCGCGA AGAGTGAACA 1020

GGGTGTCAGC AAGCGCGGAC GATCCTCGTG CCGAATTC 1058
(2) INFORMATION FOR SEQ ID NO: 46: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 327 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46: 

CGGCACGAGA GACCGATGCC GCTACCCTCG CGCAGGAGGC AGGTAATTTC GAGCGGATCT 60

CCGGCGACCT GAAAACCCAG ATCGACCAGG TGGAGTCGAC GGCAGGTTCG TTGCAGGGCC 120

AGTGGCGCGG CGCGGCGGGG ACGGCCGCCC AGGCCGCGGT GGTGCGCTTC CAAGAAGCAG 180

CCAATAAGCA GAAGCAGGAA CTCGACGAGA TCTCGACGAA TATTCGTCAG GCCGGCGTCC 240 

AATACTCGAG GGCCGACGAG GAGCAGCAGC AGGCGCTGTC CTCGCAAATG GGCTTCTGAC 300 

CCGCTAATAC GAAAAGAAAC GGAGCAA 327 
(2) INFORMATION FOR SEQ ID NO: 47:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 170 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47:

CGGTCGCGAT GATGGCGTTG TCGAACGTGA CCGATTCTGT ACCGCCGTCG TTGAGATCAA 60

CCAACAACGT GTTGGCGTCG GCAAATGTGC CGNACCCGTG GATCTCGGTG ATCTTGTTCT 120

TCTTCATCAG GAAGTGCACA CCGGCCACCC TGCCCTCGGN TACCTTTCGG 170

(2) INFORMATION FOR SEQ ID NO: 48: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 127 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48:

3ATCCGGCGG CACGGGGGGT GCCGGCGGCA GCACCGCTGG CGCTGGCGGC AACGGCGGGG 60

CCGGGGGTGG CGGCGGAACC GGTGGGTTGC TCTTCGGCAA CGGCGGTGCC GGCGGGCACG 120

GGGCCGT 127 
(2) INFORMATION FOR SEQ ID NO: 49: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 81 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49: 
CGGCGGCAAG GGCGGCACCG CCGGCAACGG GAGCGGCGCG GCCGGCGGCA ACGGCGGCAA 60

CGGCGGCTCC GGCCTCAACG G 81

(2) INFORMATION FOR SEQ ID NO: 50:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 149 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50: 
GATCAGGGCT GGCCGGCTCC GGCCAGAAGG GCGGTAACGG AGGAGCTGCC GGATTGTTTG 
GCAACGGCGG GGCCGC2JGGT GCCGGCGCGT CCAACCAAGC CGGTAACGGC GGNGCCGGCG 
GAAACGGTGG TGCCGGTGGG CTGATCTGG 
(2) INFORMATION FOR SEQ ID NO: 51:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 355 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51: 
CGGCACGAGA TCACACCTAC CGAGTGATCG AGATCGTCGG GACCTCGCCC GACGGTGTCG 
ACGCGGNAAT CCAGGGCGGT CTGGCCCGAG CTGCGCAGAC CATGCGCGCG CTGGACTGGT 
TCGAAGTACA GTCAATTCGA GGCCACCTGG TCGACGGAGC GGTCGCGCAC TTCCAGGTGA 
CTATGAAAGT CGGCTTCCGC CTGGAGGATT CCTGAACCTT CAAGCGCGGC CGATAACTGA 
GGTGCATCAT TAAGCGACTT TTCCAGAACA TCCTGACGCG CTCGAAACGC GGTTCAGCCG 
ACGGTGGCTC CGCCGAGGCG CTGCCTCCAA AATCCCTGCG ACAATTCGTC GGCGG 
(2) INFORMATION FOR SEQ ID NO: 52: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 999 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52: 

ATGCATCACC ATCACCATCA CATGCATCAG GTGGACCCCA ACTTGACACG TCGCAAGGGA 

CGATTGGCGG CACTGGCTAT CGCGGCGATG GCCAGCGCCA GCCTGGTGAC CGTTGCGGTG 

CCCGCGACCG CCAACGCCGA TCCGGAGCCA GCGCCCCCGG TACCCACAAC GGCCGCCTCG 

CCGCCGTCGA CCGCTGCAGC GCCACCCGCA CCGGCGACAC CTGTTGCCCC CCCACCACCG

GCCGCCGCCA ACACGCCGAA TGCCCAGCCG GGCGATCCCA ACGCAGCACC TCCGCCGGCC 300

GACCCGAACG CACCGCCGCC ACCTGTCATT GCCCCAAACG CACCCCAACC TGTCCGGATC 360

GACAACCCGG TTGGAGGATT CAGCTTCGCG CTGCCTGCTG GCTGGGTGGA GTCTGACGCC 420

GCCCACTTCG ACTACGGTTC AGCACTCCTC AGCAAAACCA CCGGGGACCC GCCATTTCCC 480

GGACAGCCGC CGCCGGTGGC CAATGACACC CGTATCGTGC TCGGCCGGCT AGACCAAAAG 540

CTTTACGCCA GCGCCGAAGC CACCGACTCC AAGGCCGCGG CCCGGTTGGG CTCGGACATG 600

GGTGAGTTCT ATATGCCCTA CCCGGGCACC CGGATCAACC AGGAAACCGT CTCGCTCGAC 660

GCCAACGGGG TGTCTGGAAG CGCGTCGTAT TACGAAGTCA AGTTCAGCGA TCCGAGTAAG 720

CCGAACGGCC AGATCTGGAC GGGCGTAATC GGCTCGCCCG CGGCGAACGC ACCGGACGCC 780

GGGCCCCCTC AGCGCTGGTT TGTGGTATGG CTCGGGACCG CCAACAACCC GGTGGACAAG 840

GGCGCGGCCA AGGCGCTGGC CGAATCGATC CGGCCTTTGG TCGCCCCGCC GCCGGCGCCG 900 

GCACCGGCTC CTGCAGAGCC CGCTCCGGCG CCGGCGCCGG CCGGGGAAGT CGCTCCTACC 960 

CCGACGACAC CGACACCGCA GCGGACCTTA CCGGCCTGA 999 
(2) INFORMATION FOR SEQ ID NO: 53: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 332 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53: 

Met His His His His His His Met His Gln Val Asp Pro Asn Leu Thr
1 5 10 15

Arg Arg Lys Gly Arg Leu Ala Ala Leu Ala Ile Ala Ala Met Ala Ser
20 25 30

Ala Ser Leu Val Thr Val Ala Val Pro Ala Thr Ala Asn Ala Asp Pro
35 40 45

Glu Pro Ala Pro Pro Val Pro Thr Thr Ala Ala Ser Pro Pro Ser Thr
50 55 60

Ala Ala Ala Pro Pro Ala Pro Ala Thr Pro Val Ala Pro Pro Pro Pro
65 70 75 80

Ala Ala Ala Asn Thr Pro Asn Ala Gln Pro Gly Asp Pro Asn Ala Ala
85 90 95

Pro Pro Pro Ala Asp Pro Asn Ala Pro Pro Pro Pro Val Ile Ala Pro
100 105 110

Asn Ala Pro Gln Pro Val Arg Ile Asp Asn Pro Val Gly Gly Phe Ser
115 120 125

Phe Ala Leu Pro Ala Gly Trp Val Glu Ser Asp Ala Ala His Phe Asp
130 135 140

Tyr Gly Ser Ala Leu Leu Ser Lys Thr Thr Gly Asp Pro Pro Phe Pro
145 150 155 160

Gly Gln Pro Pro Pro Val Ala Asn Asp Thr Arg Ile Val Leu Gly Arg
165 170 175

Leu Asp Gln Lys Leu Tyr Ala Ser Ala Glu Ala Thr Asp Ser Lys Ala
180 185 190

Ala Ala Arg Leu Gly Ser Asp Met Gly Glu Phe Tyr Met Pro Tyr Pro
195 200 205

Gly Thr Arg Ile Asn Gln Glu Thr Val Ser Leu Asp Ala Asn Gly Val
210 215 220

Ser Gly Ser Ala Ser Tyr Tyr Glu Val Lys Phe Ser Asp Pro Ser Lys
225 230 235 240

Pro Asn Gly Gln Ile Trp Thr Gly Val Ile Gly Ser Pro Ala Ala Asn
245 250 255

Ala Pro Asp Ala Gly Pro Pro Gln Arg Trp Phe Val Val Trp Leu Gly
260 265 270

Thr Ala Asn Asn Pro Val Asp Lys Gly Ala Ala Lys Ala Leu Ala Glu
275 280 285

Ser Ile Arg Pro Leu Val Ala Pro Pro Pro Ala Pro Ala Pro Ala Pro
290 295 300

Ala Glu Pro Ala Pro Ala Pro Ala Pro Ala Gly Glu Val Ala Pro Thr
305 310 315 320

Pro Thr Thr Pro Thr Pro Gln Arg Thr Leu Pro Ala
325 330

(2) INFORMATION FOR SEQ ID NO: 54:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54:

Asp Pro Val Asp Ala Val Ile Asn Thr Thr Xaa Asn Tyr Gly Gln Val
1 5 10 15

Val Ala Ala Leu
20

(2) INFORMATION FOR SEQ ID NO: 55: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(Xi) SEQUENCE DESCRIPTION: SEQ ID NO:55: 

Ala Val Glu Ser Gly Met Leu Ala Leu Gly Thr Pro Ala Pro Ser 

1 5 10 15

(2) INFORMATION FOR SEQ ID NO: 56: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56: 

Ala Ala Met Lys Pro Arg Thr Gly Asp Gly Pro Leu Glu Ala Ala Lys 
1 5 10 15

Glu Gly Arg 



(2) INFORMATION FOR SEQ ID NO: 57:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 15 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57:

Tyr Tyr Trp Cys Pro Gly Gln Pro Phe Asp Pro Ala Trp Gly Pro
1 5 10 15

(2) INFORMATION FOR SEQ ID NO: 58: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 14 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58:

Asp Ile Gly Ser Glu Ser Thr Glu Asp Gln Gln Xaa Ala Val
1 5 10

(2) INFORMATION FOR SEQ ID NO: 59: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59: 

Ala Glu Glu Ser Ile Ser Thr Xaa Glu Xaa Ile Val Pro
1 5 10

(2) INFORMATION FOR SEQ ID NO: 60:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60: 

Asp Pro Glu Pro Ala Pro Pro Val Pro Thr Ala Ala Ala Ala Pro Pro 

1 5 10 15

Ala 



(2) INFORMATION FOR SEQ ID NO: 61:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 15 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61:

Ala Pro Lys Thr Tyr Xaa Glu Glu Leu Lys Gly Thr Asp Thr Gly
1 5 10 15

(2) INFORMATION FOR SEQ ID NO: 62:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62:

Asp Pro Ala Ser Ala Pro Asp Val Pro Thr Ala Ala Gln Gln Thr Ser
1 5 10 15

Leu Leu Asn Asn Leu Ala Asp Pro Asp Val Ser Phe Ala Asp 
20 25 30 

(2) INFORMATION FOR SEQ ID NO: 63: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63: 

Gly Cys Gly Asp Arg Ser Gly Gly Asn Leu Asp Gln Ile Arg Leu Arg
1 5 10 15

Arg Asp Arg Ser Gly Gly Asn Leu
20

(2) INFORMATION FOR SEQ ID NO: 64:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 187 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

;xi) SEQUENCE DESCRIPTION: SEQ ID JJ0:64: 

Thr Gly Ser Leu Asn Gln Thr His Asn Arg Arg Ala Asn Glu Arg Lys
1 5 10 15

Asn Thr Thr Met Lys Met Val Lys Ser Ile Ala Ala Gly Leu Thr Ala
20 25 30

Ala Ala Ala Ile Gly Ala Ala Ala Ala Gly Val Thr Ser Ile Met Ala
35 40 45

Gly Gly Pro Val Val Tyr Gln Met Gln Pro Val Val Phe Gly Ala Pro
50 55 60

Leu Pro Leu Asp Pro Ala Ser Ala Pro Asp Val Pro Thr Ala Ala Gln
65 70 75 80

Leu Thr Ser Leu Leu Asn Ser Leu Ala Asp Pro Asn Val Ser Phe Ala
85 90 95

Asn Lys Gly Ser Leu Val Glu Gly Gly Ile Gly Gly Thr Glu Ala Arg
100 105 110

Ile Ala Asp His Lys Leu Lys Lys Ala Ala Glu His Gly Asp Leu Pro
115 120 125

Leu Ser Phe Ser Val Thr Asn Ile Gln Pro Ala Ala Ala Gly Ser Ala
130 135 140

Thr Ala Asp Val Ser Val Ser Gly Pro Lys Leu Ser Ser Pro Val Thr
145 150 155 160

Gln Asn Val Thr Phe Val Asn Gln Gly Gly Trp Met Leu Ser Arg Ala
165 170 175

Ser Ala Met Glu Leu Leu Gln Ala Ala Gly Xaa
180 185

(2) INFORMATION FOR SEQ ID NO: 65:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 148 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65: 

Asp Glu Val Thr Val Glu Thr Thr Ser Val Phe Arg Ala Asp Phe Leu
1 5 10 15

Ser Glu Leu Asp Ala Pro Ala Gln Ala Gly Thr Glu Ser Ala Val Ser
20 25 30

Gly Val Glu Gly Leu Pro Pro Gly Ser Ala Leu Leu Val Val Lys Arg
35 40 45

Gly Pro Asn Ala Gly Ser Arg Phe Leu Leu Asp Gln Ala Ile Thr Ser
50 55 60

Ala Gly Arg His Pro Asp Ser Asp Ile Phe Leu Asp Asp Val Thr Val
65 70 75 80

Ser Arg Arg His Ala Glu Phe Arg Leu Glu Asn Asn Glu Phe Asn Val
85 90 95

Val Asp Val Gly Ser Leu Asn Gly Thr Tyr Val Asn Arg Glu Pro Val
100 105 110

Asp Ser Ala Val Leu Ala Asn Gly Asp Glu Val Gln Ile Gly Lys Leu
115 120 125

Arg Leu Val Phe Leu Thr Gly Pro Lys Gln Gly Glu Asp Asp Gly Ser
130 135 140

Thr Gly Gly Pro
145

(2) INFORMATION FOR SEQ ID NO: 66:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 230 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66: 



Thr Ser Asn Arg Pro Ala Arg Arg Gly Arg Arg Ala Pro Arg Asp Thr
1 5 10 15

Gly Pro Asp Arg Ser Ala Ser Leu Ser Leu Val Arg His Arg Arg Gln
20 25 30

Gln Arg Asp Ala Leu Cys Leu Ser Ser Thr Gln Ile Ser Arg Gln Ser
35 40 45

Asn Leu Pro Pro Ala Ala Gly Gly Ala Ala Asn Tyr Ser Arg Arg Asn
50 55 60

Phe Asp Val Arg Ile Lys Ile Phe Met Leu Val Thr Ala Val Val Leu
65 70 75 80

Leu Cys Cys Ser Gly Val Ala Thr Ala Ala Pro Lys Thr Tyr Cys Glu
85 90 95

Glu Leu Lys Gly Thr Asp Thr Gly Gln Ala Cys Gln Ile Gln Met Ser
100 105 110

Asp Pro Ala Tyr Asn Ile Asn Ile Ser Leu Pro Ser Tyr Tyr Pro Asp
115 120 125

Gln Lys Ser Leu Glu Asn Tyr Ile Ala Gln Thr Arg Asp Lys Phe Leu
130 135 140

Ser Ala Ala Thr Ser Ser Thr Pro Arg Glu Ala Pro Tyr Glu Leu Asn
145 150 155 160

Ile Thr Ser Ala Thr Tyr Gln Ser Ala Ile Pro Pro Arg Gly Thr Gln
165 170 175

Ala Val Val Leu Xaa Val Tyr His Asn Ala Gly Gly Thr His Pro Thr
180 185 190

Thr Thr Tyr Lys Ala Phe Asp Trp Asp Gln Ala Tyr Arg Lys Pro Ile
195 200 205

Thr Tyr Asp Thr Leu Trp Gln Ala Asp Thr Asp Pro Leu Pro Val Val
210 215 220

Phe Pro Ile Val Ala Arg
225 230

(2) INFORMATION FOR SEQ ID NO: 67:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 132 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67: 

Thr Ala Ala Ser Asp Asn Phe Gln Leu Ser Gln Gly Gly Gln Gly Phe
1 5 10 15

Ala Ile Pro Ile Gly Gln Ala Met Ala Ile Ala Gly Gln Ile Arg Ser
20 25 30

Gly Gly Gly Ser Pro Thr Val His Ile Gly Pro Thr Ala Phe Leu Gly
35 40 45

Leu Gly Val Val Asp Asn Asn Gly Asn Gly Ala Arg Val Gln Arg Val
50 55 60

Val Gly Ser Ala Pro Ala Ala Ser Leu Gly Ile Ser Thr Gly Asp Val
65 70 75 80

Ile Thr Ala Val Asp Gly Ala Pro Ile Asn Ser Ala Thr Ala Met Ala
85 90 95

Asp Ala Leu Asn Gly His His Pro Gly Asp Val Ile Ser Val Asn Trp
100 105 110

Gln Thr Lys Ser Gly Gly Thr Arg Thr Gly Asn Val Thr Leu Ala Glu
115 120 125

Gly Pro Pro Ala
130

(2) INFORMATION FOR SEQ ID NO: 68: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 100 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68: 

Val Pro Leu Arg Ser Pro Ser Met Ser Pro Ser Lys Cys Leu Ala Ala
1 5 10 15

Ala Gln Arg Asn Pro Val Ile Arg Arg Arg Arg Leu Ser Asn Pro Pro
20 25 30

Pro Arg Lys Tyr Arg Ser Met Pro Ser Pro Ala Thr Ala Ser Ala Gly
35 40 45

Met Ala Arg Val Arg Arg Arg Ala Ile Trp Arg Gly Pro Ala Thr Xaa
50 55 60

Ser Ala Gly Met Ala Arg Val Arg Arg Trp Xaa Val Met Pro Xaa Val
65 70 75 80

Ile Gln Ser Thr Xaa Ile Arg Xaa Xaa Gly Pro Phe Asp Asn Arg Gly
85 90 95

Ser Glu Arg Lys
100



(2) INFORMATION FOR SEQ ID NO: 69: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 163 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69: 



Met Thr Asp Asp Ile Leu Leu Ile Asp Thr Asp Glu Arg Val Arg Thr
1 5 10 15

Leu Thr Leu Asn Arg Pro Gln Ser Arg Asn Ala Leu Ser Ala Ala Leu
20 25 30

Arg Asp Arg Phe Phe Ala Xaa Leu Xaa Asp Ala Glu Xaa Asp Asp Asp
35 40 45

Ile Asp Val Val Ile Leu Thr Gly Ala Asp Pro Val Phe Cys Ala Gly
50 55 60

Leu Asp Leu Lys Val Ala Gly Arg Ala Asp Arg Ala Ala Gly His Leu
65 70 75 80

Thr Ala Val Gly Gly His Asp Gln Ala Gly Asp Arg Arg Asp Gln Arg
85 90 95

Arg Arg Gly His Arg Arg Ala Arg Thr Gly Ala Val Leu Arg His Pro
100 105 110

Asp Arg Leu Arg Ala Arg Pro Leu Arg Arg His Pro Arg Pro Gly Gly
115 120 125

Ala Ala Ala His Leu Gly Thr Gln Cys Val Leu Ala Ala Lys Gly Arg
130 135 140

His Arg Xaa Gly Pro Val Asp Glu Pro Asp Arg Arg Leu Pro Val Arg
145 150 155 160

Asp Arg Arg

(2) INFORMATION FOR SEQ ID NO: 70:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 344 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:70: 

Met Lys Phe Val Asn His Ile Glu Pro Val Ala Pro Arg Arg Ala Gly
1 5 10 15

Gly Ala Val Ala Glu Val Tyr Ala Glu Ala Arg Arg Glu Phe Gly Arg
20 25 30

Leu Pro Glu Pro Leu Ala Met Leu Ser Pro Asp Glu Gly Leu Leu Thr
35 40 45

Ala Gly Trp Ala Thr Leu Arg Glu Thr Leu Leu Val Gly Gln Val Pro
50 55 60

Arg Gly Arg Lys Glu Ala Val Ala Ala Ala Val Ala Ala Ser Leu Arg
65 70 75 80

Cys Pro Trp Cys Val Asp Ala His Thr Thr Met Leu Tyr Ala Ala Gly
85 90 95

Gln Thr Asp Thr Ala Ala Ala Ile Leu Ala Gly Thr Ala Pro Ala Ala
100 105 110

Gly Asp Pro Asn Ala Pro Tyr Val Ala Trp Ala Ala Gly Thr Gly Thr
115 120 125

Pro Ala Gly Pro Pro Ala Pro Phe Gly Pro Asp Val Ala Ala Glu Tyr
130 135 140

Leu Gly Thr Ala Val Gln Phe His Phe Ile Ala Arg Leu Val Leu Val
145 150 155 160

Leu Leu Asp Glu Thr Phe Leu Pro Gly Gly Pro Arg Ala Gln Gln Leu
165 170 175

Met Arg Arg Ala Gly Gly Leu Val Phe Ala Arg Lys Val Arg Ala Glu
180 185 190

His Arg Pro Gly Arg Ser Thr Arg Arg Leu Glu Pro Arg Thr Leu Pro
195 200 205

Asp Asp Leu Ala Trp Ala Thr Pro Ser Glu Pro Ile Ala Thr Ala Phe
210 215 220

Ala Ala Leu Ser His His Leu Asp Thr Ala Pro His Leu Pro Pro Pro
225 230 235 240

Thr Arg Gln Val Val Arg Arg Val Val Gly Ser Trp His Gly Glu Pro
245 250 255

Met Pro Met Ser Ser Arg Trp Thr Asn Glu His Thr Ala Glu Leu Pro
260 265 270

Ala Asp Leu His Ala Pro Thr Arg Leu Ala Leu Leu Thr Gly Leu Ala
275 280 285

Pro His Gln Val Thr Asp Asp Asp Val Ala Ala Ala Arg Ser Leu Leu
290 295 300

Asp Thr Asp Ala Ala Leu Val Gly Ala Leu Ala Trp Ala Ala Phe Thr
305 310 315 320

Ala Ala Arg Arg Ile Gly Thr Trp Ile Gly Ala Ala Ala Glu Gly Gln
325 330 335

Val Ser Arg Gln Asn Pro Thr Gly
340



(2) INFORMATION FOR SEQ ID NO: 71:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 485 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71:



Asp Asp Pro Asp Met Pro Gly Thr Val Ala Lys Ala Val Ala Asp Ala
1 5 10 15

Leu Gly Arg Gly Ile Ala Pro Val Glu Asp Ile Gln Asp Cys Val Glu
20 25 30

Ala Arg Leu Gly Glu Ala Gly Leu Asp Asp Val Ala Arg Val Tyr Ile
35 40 45

Ile Tyr Arg Gln Arg Arg Ala Glu Leu Arg Thr Ala Lys Ala Leu Leu
50 55 60

Gly Val Arg Asp Glu Leu Lys Leu Ser Leu Ala Ala Val Thr Val Leu
65 70 75 80

Arg Glu Arg Tyr Leu Leu His Asp Glu Gln Gly Arg Pro Ala Glu Ser
85 90 95

Thr Gly Glu Leu Met Asp Arg Ser Ala Arg Cys Val Ala Ala Ala Glu
100 105 110

Asp Gln Tyr Glu Pro Gly Ser Ser Arg Arg Trp Ala Glu Arg Phe Ala
115 120 125

Thr Leu Leu Arg Asn Leu Glu Phe Leu Pro Asn Ser Pro Thr Leu Met
130 135 140

Asn Ser Gly Thr Asp Leu Gly Leu Leu Ala Gly Cys Phe Val Leu Pro
145 150 155 160

Ile Glu Asp Ser Leu Gln Ser Ile Phe Ala Thr Leu Gly Gln Ala Ala
165 170 175

Glu Leu Gln Arg Ala Gly Gly Gly Thr Gly Tyr Ala Phe Ser His Leu
180 185 190

Arg Pro Ala Gly Asp Arg Val Ala Ser Thr Gly Gly Thr Ala Ser Gly
195 200 205

Pro Val Ser Phe Leu Arg Leu Tyr Asp Ser Ala Ala Gly Val Val Ser
210 215 220

Met Gly Gly Arg Arg Arg Gly Ala Cys Met Ala Val Leu Asp Val Ser
225 230 235 240

His Pro Asp Ile Cys Asp Phe Val Thr Ala Lys Ala Glu Ser Pro Ser
245 250 255

Glu Leu Pro His Phe Asn Leu Ser Val Gly Val Thr Asp Ala Phe Leu
260 265 270

Arg Ala Val Glu Arg Asn Gly Leu His Arg Leu Val Asn Pro Arg Thr
275 280 285

Gly Lys Ile Val Ala Arg Met Pro Ala Ala Glu Leu Phe Asp Ala Ile
290 295 300

Cys Lys Ala Ala His Ala Gly Gly Asp Pro Gly Leu Val Phe Leu Asp
305 310 315 320

Thr Ile Asn Arg Ala Asn Pro Val Pro Gly Arg Gly Arg Ile Glu Ala
325 330 335

Thr Asn Pro Cys Gly Glu Val Pro Leu Leu Pro Tyr Glu Ser Cys Asn
340 345 350

Leu Gly Ser Ile Asn Leu Ala Arg Met Leu Ala Asp Gly Arg Val Asp
355 360 365

Trp Asp Arg Leu Glu Glu Val Ala Gly Val Ala Val Arg Phe Leu Asp
370 375 380

Asp Val Ile Asp Val Ser Arg Tyr Pro Phe Pro Glu Leu Gly Glu Ala
385 390 395 400

Ala Arg Ala Thr Arg Lys Ile Gly Leu Gly Val Met Gly Leu Ala Glu
405 410 415

Leu Leu Ala Ala Leu Gly Ile Pro Tyr Asp Ser Glu Glu Ala Val Arg
420 425 430

Leu Ala Thr Arg Leu Met Arg Arg Ile Gln Gln Ala Ala His Thr Ala
435 440 445

Ser Arg Arg Leu Ala Glu Glu Arg Gly Ala Phe Pro Ala Phe Thr Asp
450 455 460

Ser Arg Phe Ala Arg Ser Gly Pro Arg Arg Asn Ala Gln Val Thr Ser
465 470 475 480

Val Ala Pro Thr Gly
485

(2) INFORMATION FOR SEQ ID NO: 72: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 267 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72:

Gly Val Ile Val Leu Asp Leu Glu Pro Arg Gly Pro Leu Pro Thr Glu
1 5 10 15

Ile Tyr Trp Arg Arg Arg Gly Leu Ala Leu Gly Ile Ala Val Val Val
20 25 30

Val Gly Ile Ala Val Ala Ile Val Ile Ala Phe Val Asp Ser Ser Ala
35 40 45

Gly Ala Lys Pro Val Ser Ala Asp Lys Pro Ala Ser Ala Gln Ser His
50 55 60

Pro Gly Ser Pro Ala Pro Gln Ala Pro Gln Pro Ala Gly Gln Thr Glu
65 70 75 80

Gly Asn Ala Ala Ala Ala Pro Pro Gln Gly Gln Asn Pro Glu Thr Pro
85 90 95

Thr Pro Thr Ala Ala Val Gln Pro Pro Pro Val Leu Lys Glu Gly Asp
100 105 110

Asp Cys Pro Asp Ser Thr Leu Ala Val Lys Gly Leu Thr Asn Ala Pro
115 120 125

Gln Tyr Tyr Val Gly Asp Gln Pro Lys Phe Thr Met Val Val Thr Asn
130 135 140

Ile Gly Leu Val Ser Cys Lys Arg Asp Val Gly Ala Ala Val Leu Ala
145 150 155 160

Ala Tyr Val Tyr Ser Leu Asp Asn Lys Arg Leu Trp Ser Asn Leu Asp
165 170 175

Cys Ala Pro Ser Asn Glu Thr Leu Val Lys Thr Phe Ser Pro Gly Glu
180 185 190

Gln Val Thr Thr Ala Val Thr Trp Thr Gly Met Gly Ser Ala Pro Arg
195 200 205

Cys Pro Leu Pro Arg Pro Ala Ile Gly Pro Gly Thr Tyr Asn Leu Val
210 215 220

Val Gln Leu Gly Asn Leu Arg Ser Leu Pro Val Pro Phe Ile Leu Asn
225 230 235 240

Gln Pro Pro Pro Pro Pro Gly Pro Val Pro Ala Pro Gly Pro Ala Gln
245 250 255

Ala Pro Pro Pro Glu Ser Pro Ala Gln Gly Gly
260 265

(2) INFORMATION FOR SEQ ID NO: 73: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 97 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73: 

Leu Ile Ser Thr Gly Lys Ala Ser His Ala Ser Leu Gly Val Gln Val
1 5 10 15

Thr Asn Asp Lys Asp Thr Pro Gly Ala Lys Ile Val Glu Val Val Ala
20 25 30

Gly Gly Ala Ala Ala Asn Ala Gly Val Pro Lys Gly Val Val Val Thr
35 40 45

Lys Val Asp Asp Arg Pro Ile Asn Ser Ala Asp Ala Leu Val Ala Ala
50 55 60

Val Arg Ser Lys Ala Pro Gly Ala Thr Val Ala Leu Thr Phe Gln Asp
65 70 75 80

Pro Ser Gly Gly Ser Arg Thr Val Gln Val Thr Leu Gly Lys Ala Glu
85 90 95

Gln



(2) INFORMATION FOR SEQ ID NO: 74:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 364 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74:

Gly Ala Ala Val Ser Leu Leu Ala Ala Gly Thr Leu Val Leu Thr Ala
1 5 10 15

Cys Gly Gly Gly Thr Asn Ser Ser Ser Ser Gly Ala Gly Gly Thr Ser
20 25 30

Gly Ser Val His Cys Gly Gly Lys Lys Glu Leu His Ser Ser Gly Ser
35 40 45

Thr Ala Gln Glu Asn Ala Met Glu Gln Phe Val Tyr Ala Tyr Val Arg
50 55 60

Ser Cys Pro Gly Tyr Thr Leu Asp Tyr Asn Ala Asn Gly Ser Gly Ala
65 70 75 80

Gly Val Thr Gln Phe Leu Asn Asn Glu Thr Asp Phe Ala Gly Ser Asp
85 90 95

Val Pro Leu Asn Pro Ser Thr Gly Gln Pro Asp Arg Ser Ala Glu Arg
100 105 110

Cys Gly Ser Pro Ala Trp Asp Leu Pro Thr Val Phe Gly Pro Ile Ala
115 120 125

Ile Thr Tyr Asn Ile Lys Gly Val Ser Thr Leu Asn Leu Asp Gly Pro
130 135 140

Thr Thr Ala Lys Ile Phe Asn Gly Thr Ile Thr Val Trp Asn Asp Pro
145 150 155 160

Gln Ile Gln Ala Leu Asn Ser Gly Thr Asp Leu Pro Pro Thr Pro Ile
165 170 175

Ser Val Ile Phe Arg Ser Asp Lys Ser Gly Thr Ser Asp Asn Phe Gln
180 185 190

Lys Tyr Leu Asp Gly Val Ser Asn Gly Ala Trp Gly Lys Gly Ala Ser
195 200 205

Glu Thr Phe Ser Gly Gly Val Gly Val Gly Ala Ser Gly Asn Asn Gly
210 215 220

Thr Ser Ala Leu Leu Gln Thr Thr Asp Gly Ser Ile Thr Tyr Asn Glu
225 230 235 240

Trp Ser Phe Ala Val Gly Lys Gln Leu Asn Met Ala Gln Ile Ile Thr
245 250 255

Ser Ala Gly Pro Asp Pro Val Ala Ile Thr Thr Glu Ser Val Gly Lys
260 265 270

Thr Ile Ala Gly Ala Lys Ile Met Gly Gln Gly Asn Asp Leu Val Leu
275 280 285

Asp Thr Ser Ser Phe Tyr Arg Pro Thr Gln Pro Gly Ser Tyr Pro Ile
290 295 300

Val Leu Ala Thr Tyr Glu Ile Val Cys Ser Lys Tyr Pro Asp Ala Thr
305 310 315 320

Thr Gly Thr Ala Val Arg Ala Phe Met Gln Ala Ala Ile Gly Pro Gly
325 330 335

Gln Glu Gly Leu Asp Gln Tyr Gly Ser Ile Pro Leu Pro Lys Ser Phe
340 345 350

Gln Ala Lys Leu Ala Ala Ala Val Asn Ala Ile Ser
355 360

(2) INFORMATION FOR SEQ ID NO: 75:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 309 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75: 

Gln Ala Ala Ala Gly Arg Ala Val Arg Arg Thr Gly His Ala Glu Asp
1 5 10 15

Gln Thr His Gln Asp Arg Leu His His Gly Cys Arg Arg Ala Ala Val
20 25 30

Val Val Arg Gln Asp Arg Ala Ser Val Ser Ala Thr Ser Ala Arg Pro
35 40 45

Pro Arg Arg His Pro Ala Gln Gly His Arg Arg Arg Val Ala Pro Ser
50 55 60

Gly Gly Arg Arg Arg Pro His Pro His His Val Gln Pro Asp Asp Arg
65 70 75 80

Arg Asp Arg Pro Ala Leu Leu Asp Arg Thr Gln Pro Ala Glu His Pro
85 90 95

Asp Pro His Arg Arg Gly Pro Ala Asp Pro Gly Arg Val Arg Gly Arg
100 105 110

Gly Arg Leu Arg Arg Val Asp Asp Gly Arg Leu Gln Pro Asp Arg Asp
115 120 125

Ala Asp His Gly Ala Pro Val Arg Gly Arg Gly Pro His Arg Gly Val
130 135 140

Gln His Arg Gly Gly Pro Val Phe Val Arg Arg Val Pro Gly Val Arg
145 150 155 160

Cys Ala His Arg Arg Gly His Arg Arg Val Ala Ala Pro Gly Gln Gly
165 170 175

Asp Val Leu Arg Ala Gly Leu Arg Val Glu Arg Leu Arg Pro Val Ala
180 185 190

Ala Val Glu Asn Leu His Arg Gly Ser Gln Arg Ala Asp Gly Arg Val
195 200 205

Phe Arg Pro Ile Arg Arg Gly Ala Arg Leu Pro Ala Arg Arg Ser Arg
210 215 220

Ala Gly Pro Gln Gly Arg Leu His Leu Asp Gly Ala Gly Pro Ser Pro
225 230 235 240

Leu Pro Ala Arg Ala Gly Gln Gln Gln Pro Ser Ser Ala Gly Gly Arg
245 250 255

Arg Ala Gly Gly Ala Glu Arg Ala Asp Pro Gly Gln Arg Gly Arg His
260 265 270

His Gln Gly Gly His Asp Pro Gly Arg Gln Gly Ala Gln Arg Gly Thr
275 280 285

Ala Gly Val Ala His Ala Ala Ala Gly Pro Arg Arg Ala Ala Val Arg
290 295 300

Asn Arg Pro Arg Arg
305

(2) INFORMATION FOR SEQ ID NO: 76:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 580 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76: 

Ser Ala Val Trp Cys Leu Asn Gly Phe Thr Gly Arg His Arg His Gly
1 5 10 15

Arg Cys Arg Val Arg Ala Ser Gly Trp Arg Ser Ser Asn Arg Trp Cys
20 25 30

Ser Thr Thr Ala Asp Cys Cys Ala Ser Lys Thr Pro Thr Gln Ala Ala
35 40 45

Ser Pro Leu Glu Arg Arg Phe Thr Cys Cys Ser Pro Ala Val Gly Cys
50 55 60

Arg Phe Arg Ser Phe Pro Val Arg Arg Leu Ala Leu Gly Ala Arg Thr
65 70 75 80

Ser Arg Thr Leu Gly Val Arg Arg Thr Leu Ser Gln Trp Asn Leu Ser
85 90 95

Pro Arg Ala Gln Pro Ser Cys Ala Val Thr Val Glu Ser His Thr His
100 105 110

Ala Ser Pro Arg Met Ala Lys Leu Ala Arg Val Val Gly Leu Val Gln
115 120 125

Glu Glu Gln Pro Ser Asp Met Thr Asn His Pro Arg Tyr Ser Pro Pro
130 135 140

Pro Gln Gln Pro Gly Thr Pro Gly Tyr Ala Gln Gly Gln Gln Gln Thr
145 150 155 160

Tyr Ser Gln Gln Phe Asp Trp Arg Tyr Pro Pro Ser Pro Pro Pro Gln
165 170 175

Pro Thr Gln Tyr Arg Gln Pro Tyr Glu Ala Leu Gly Gly Thr Arg Pro
180 185 190

Gly Leu Ile Pro Gly Val Ile Pro Thr Met Thr Pro Pro Pro Gly Met
195 200 205

Val Arg Gln Arg Pro Arg Ala Gly Met Leu Ala Ile Gly Ala Val Thr
210 215 220

Ile Ala Val Val Ser Ala Gly Ile Gly Gly Ala Ala Ala Ser Leu Val
225 230 235 240

Gly Phe Asn Arg Ala Pro Ala Gly Pro Ser Gly Gly Pro Val Ala Ala
245 250 255

Ser Ala Ala Pro Ser Ile Pro Ala Ala Asn Met Pro Pro Gly Ser Val
260 265 270

Glu Gln Val Ala Ala Lys Val Val Pro Ser Val Val Met Leu Glu Thr
275 280 285

Asp Leu Gly Arg Gln Ser Glu Glu Gly Ser Gly Ile Ile Leu Ser Ala
290 295 300

Glu Gly Leu Ile Leu Thr Asn Asn His Val Ile Ala Ala Ala Ala Lys
305 310 315 320

Pro Pro Leu Gly Ser Pro Pro Pro Lys Thr Thr Val Thr Phe Ser Asp
325 330 335

Gly Arg Thr Ala Pro Phe Thr Val Val Gly Ala Asp Pro Thr Ser Asp
340 345 350

Ile Ala Val Val Arg Val Gln Gly Val Ser Gly Leu Thr Pro Ile Ser
355 360 365

Leu Gly Ser Ser Ser Asp Leu Arg Val Gly Gln Pro Val Leu Ala Ile
370 375 380

Gly Ser Pro Leu Gly Leu Glu Gly Thr Val Thr Thr Gly Ile Val Ser
385 390 395 400

Ala Leu Asn Arg Pro Val Ser Thr Thr Gly Glu Ala Gly Asn Gln Asn
405 410 415

Thr Val Leu Asp Ala Ile Gln Thr Asp Ala Ala Ile Asn Pro Gly Asn
420 425 430

Ser Gly Gly Ala Leu Val Asn Met Asn Ala Gln Leu Val Gly Val Asn
435 440 445

Ser Ala Ile Ala Thr Leu Gly Ala Asp Ser Ala Asp Ala Gln Ser Gly
450 455 460

Ser Ile Gly Leu Gly Phe Ala Ile Pro Val Asp Gln Ala Lys Arg Ile
465 470 475 480

Ala Asp Glu Leu Ile Ser Thr Gly Lys Ala Ser His Ala Ser Leu Gly
485 490 495

Val Gln Val Thr Asn Asp Lys Asp Thr Pro Gly Ala Lys Ile Val Glu
500 505 510

Val Val Ala Gly Gly Ala Ala Ala Asn Ala Gly Val Pro Lys Gly Val
515 520 525

Val Val Thr Lys Val Asp Asp Arg Pro Ile Asn Ser Ala Asp Ala Leu
530 535 540

Val Ala Ala Val Arg Ser Lys Ala Pro Gly Ala Thr Val Ala Leu Thr
545 550 555 560

Phe Gln Asp Pro Ser Gly Gly Ser Arg Thr Val Gln Val Thr Leu Gly
565 570 575

Lys Ala Glu Gln
580

(2) INFORMATION FOR SEQ ID NO: 77:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 233 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 77:

Met Asn Asp Gly Lys Arg Ala Val Thr Ser Ala Val Leu Val Val Leu
1 5 10 15

Gly Ala Cys Leu Ala Leu Trp Leu Ser Gly Cys Ser Ser Pro Lys Pro
20 25 30

Asp Ala Glu Glu Gln Gly Val Pro Val Ser Pro Thr Ala Ser Asp Pro
35 40 45

Ala Leu Leu Ala Glu Ile Arg Gln Ser Leu Asp Ala Thr Lys Gly Leu
50 55 60

Thr Ser Val His Val Ala Val Arg Thr Thr Gly Lys Val Asp Ser Leu
65 70 75 80

Leu Gly Ile Thr Ser Ala Asp Val Asp Val Arg Ala Asn Pro Leu Ala
85 90 95

Ala Lys Gly Val Cys Thr Tyr Asn Asp Glu Gln Gly Val Pro Phe Arg
100 105 110

Val Gln Gly Asp Asn Ile Ser Val Lys Leu Phe Asp Asp Trp Ser Asn
115 120 125

Leu Gly Ser Ile Ser Glu Leu Ser Thr Ser Arg Val Leu Asp Pro Ala
130 135 140

Ala Gly Val Thr Gln Leu Leu Ser Gly Val Thr Asn Leu Gln Ala Gln
145 150 155 160

Gly Thr Glu Val Ile Asp Gly Ile Ser Thr Thr Lys Ile Thr Gly Thr
165 170 175

Ile Pro Ala Ser Ser Val Lys Met Leu Asp Pro Gly Ala Lys Ser Ala
180 185 190

Arg Pro Ala Thr Val Trp Ile Ala Gln Asp Gly Ser His His Leu Val
195 200 205

Arg Ala Ser Ile Asp Leu Gly Ser Gly Ser Ile Gln Leu Thr Gln Ser
210 215 220

Lys Trp Asn Glu Pro Val Asn Val Asp
225 230



(2) INFORMATION FOR SEQ ID NO: 78:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 66 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78:

Val Ile Asp Ile Ile Gly Thr Ser Pro Thr Ser Trp Glu Gln Ala Ala
1 5 10 15

Ala Glu Ala Val Gln Arg Ala Arg Asp Ser Val Asp Asp Ile Arg Val
20 25 30

Ala Arg Val Ile Glu Gln Asp Met Ala Val Asp Ser Ala Gly Lys Ile
35 40 45

Thr Tyr Arg Ile Lys Leu Glu Val Ser Phe Lys Met Arg Pro Ala Gln
50 55 60

Pro Arg
65

(2) INFORMATION FOR SEQ ID NO: 79:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 69 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79:

Val Pro Pro Ala Pro Pro Leu Pro Pro Leu Pro Pro Ser Pro Ile Ser
1 5 10 15

Cys Ala Ser Pro Pro Ser Pro Pro Leu Pro Pro Ala Pro Pro Val Ala 
20 25 30 

Pro Gly Pro Pro Met Pro Pro Leu Asp Pro Trp Pro Pro Ala Pro Pro 
35 40 45 

Leu Pro TV- Ser Thr Pro Pro Gly Ala Pro Leu Pro Pro Ser Pro Pro 
50 55 60

Ser Pro Pro Leu Pro
65

(2) INFORMATION FOR SEQ ID NO: 80: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 355 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 80: 

Met Ser Asn Ser Arg Arg Arg Ser Leu Arg Trp Ser Trp Leu Leu Ser
1 5 10 15

Val Leu Ala Ala Val Gly Leu Gly Leu Ala Thr Ala Pro Ala Gln Ala
20 25 30

Ala Pro Pro Ala Leu Ser Gln Asp Arg Phe Ala Asp Phe Pro Ala Leu
35 40 45

Pro Leu Asp Pro Ser Ala Met Val Ala Gln Val Ala Pro Gln Val Val
50 55 60

Asn Ile Asn Thr Lys Leu Gly Tyr Asn Asn Ala Val Gly Ala Gly Thr
65 70 75 80

Gly Ile Val Ile Asp Pro Asn Gly Val Val Leu Thr Asn Asn His Val
85 90 95

Ile Ala Gly Ala Thr Asp Ile Asn Ala Phe Ser Val Gly Ser Gly Gln
100 105 110

Thr Tyr Gly Val Asp Val Val Gly Tyr Asp Arg Thr Gln Asp Val Ala
115 120 125

Val Leu Gln Leu Arg Gly Ala Gly Gly Leu Pro Ser Ala Ala Ile Gly
130 135 140

Gly Gly Val Ala Val Gly Glu Pro Val Val Ala Met Gly Asn Ser Gly
145 150 155 160

Gly Gln Gly Gly Thr Pro Arg Ala Val Pro Gly Arg Val Val Ala Leu
165 170 175

Gly Gln Thr Val Gln Ala Ser Asp Ser Leu Thr Gly Ala Glu Glu Thr
180 185 190

Leu Asn Gly Leu Ile Gln Phe Asp Ala Ala Ile Gln Pro Gly Asp Ser
195 200 205

Gly Gly Pro Val Val Asn Gly Leu Gly Gln Val Val Gly Met Asn Thr
210 215 220

Ala Ala Ser Asp Asn Phe Gln Leu Ser Gln Gly Gly Gln Gly Phe Ala
225 230 235 240

Ile Pro Ile Gly Gln Ala Met Ala Ile Ala Gly Gln Ile Arg Ser Gly
245 250 255

Gly Gly Ser Pro Thr Val His Ile Gly Pro Thr Ala Phe Leu Gly Leu
260 265 270

Gly Val Val Asp Asn Asn Gly Asn Gly Ala Arg Val Gln Arg Val Val
275 280 285

Gly Ser Ala Pro Ala Ala Ser Leu Gly Ile Ser Thr Gly Asp Val Ile
290 295 300

Thr Ala Val Asp Gly Ala Pro Ile Asn Ser Ala Thr Ala Met Ala Asp
305 310 315 320

Ala Leu Asn Gly His His Pro Gly Asp Val Ile Ser Val Asn Trp Gln
325 330 335

Thr Lys Ser Gly Gly Thr Arg Thr Gly Asn Val Thr Leu Ala Glu Gly
340 345 350

Pro Pro Ala
355

(2) INFORMATION FOR SEQ ID NO: 81:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 205 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81: 

Ser Pro Lys Pro Asp Ala Glu Glu Gln Gly Val Pro Val Ser Pro Thr
1 5 10 15

Ala Ser Asp Pro Ala Leu Leu Ala Glu Ile Arg Gln Ser Leu Asp Ala
20 25 30

Thr Lys Gly Leu Thr Ser Val His Val Ala Val Arg Thr Thr Gly Lys
35 40 45

Val Asp Ser Leu Leu Gly Ile Thr Ser Ala Asp Val Asp Val Arg Ala
50 55 60

Asn Pro Leu Ala Ala Lys Gly Val Cys Thr Tyr Asn Asp Glu Gln Gly
65 70 75 80

Val Pro Phe Arg Val Gln Gly Asp Asn Ile Ser Val Lys Leu Phe Asp
85 90 95

Asp Trp Ser Asn Leu Gly Ser Ile Ser Glu Leu Ser Thr Ser Arg Val
100 105 110

Leu Asp Pro Ala Ala Gly Val Thr Gln Leu Leu Ser Gly Val Thr Asn
115 120 125

Leu Gln Ala Gln Gly Thr Glu Val Ile Asp Gly Ile Ser Thr Thr Lys
130 135 140

Ile Thr Gly Thr Ile Pro Ala Ser Ser Val Lys Met Leu Asp Pro Gly
145 150 155 160

Ala Lys Ser Ala Arg Pro Ala Thr Val Trp Ile Ala Gln Asp Gly Ser
165 170 175

His His Leu Val Arg Ala Ser Ile Asp Leu Gly Ser Gly Ser Ile Gln
180 185 190

Leu Thr Gln Ser Lys Trp Asn Glu Pro Val Asn Val Asp
195 200 205

(2) INFORMATION FOR SEQ ID NO: 82:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 286 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82: 

Gly Asp Ser Phe Trp Ala Ala Ala Asp Gln Met Ala Arg Gly Phe Val
1 5 10 15

Leu Gly Ala Thr Ala Gly Arg Thr Thr Leu Thr Gly Glu Gly Leu Gln
20 25 30

His Ala Asp Gly His Ser Leu Leu Leu Asp Ala Thr Asn Pro Ala Val
35 40 45

Val Ala Tyr Asp Pro Ala Phe Ala Tyr Glu Ile Gly Tyr Ile Xaa Glu
50 55 60

Ser Gly Leu Ala Arg Met Cys Gly Glu Asn Pro Glu Asn Ile Phe Phe
65 70 75 80

Tyr Ile Thr Val Tyr Asn Glu Pro Tyr Val Gln Pro Pro Glu Pro Glu
85 90 95

Asn Phe Asp Pro Glu Gly Val Leu Gly Gly Ile Tyr Arg Tyr His Ala
100 105 110

Ala Thr Glu Gln Arg Thr Asn Lys Xaa Gln Ile Leu Ala Ser Gly Val
115 120 125

Ala Met Pro Ala Ala Leu Arg Ala Ala Gln Met Leu Ala Ala Glu Trp
130 135 140

Asp Val Ala Ala Asp Val Trp Ser Val Thr Ser Trp Gly Glu Leu Asn
145 150 155 160

Arg Asp Gly Val Val Ile Glu Thr Glu Lys Leu Arg His Pro Asp Arg
165 170 175

Pro Ala Gly Val Pro Tyr Val Thr Arg Ala Leu Glu Asn Ala Arg Gly
180 185 190

Pro Val Ile Ala Val Ser Asp Trp Met Arg Ala Val Pro Glu Gln Ile
195 200 205

Arg Pro Trp Val Pro Gly Thr Tyr Leu Thr Leu Gly Thr Asp Gly Phe
210 215 220

Gly Phe Ser Asp Thr Arg Pro Ala Gly Arg Arg Tyr Phe Asn Thr Asp
225 230 235 240

Ala Glu Ser Gln Val Gly Arg Gly Phe Gly Arg Gly Trp Pro Gly Arg
245 250 255

Arg Val Asn Ile Asp Pro Phe Gly Ala Gly Arg Gly Pro Pro Ala Gln
260 265 270

Leu Pro Gly Phe Asp Glu Gly Gly Gly Leu Arg Pro Xaa Lys
275 280 285

(2) INFORMATION FOR SEQ ID NO: 83:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 173 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:83: 

Thr Lys Phe His Ala Leu Met Gln Glu Gln Ile His Asn Glu Phe Thr
1 5 10 15

Ala Ala Gln Gln Tyr Val Ala Ile Ala Val Tyr Phe Asp Ser Glu Asp
20 25 30

Leu Pro Gln Leu Ala Lys His Phe Tyr Ser Gln Ala Val Glu Glu Arg
35 40 45

Asn His Ala Met Met Leu Val Gln His Leu Leu Asp Arg Asn Leu Arg
50 55 60

Val Glu Ile Pro Gly Val Asp Thr Val Arg Asn Gln Phe Asp Arg Pro
65 70 75 80

Arg Glu Ala Leu Ala Leu Ala Leu Asp Gln Glu Arg Thr Val Thr Asp
85 90 95

Gln Val Gly Arg Leu Thr Ala Val Ala Arg Asp Glu Gly Asp Phe Leu
100 105 110

Gly Glu Gln Phe Met Gln Trp Phe Leu Gln Glu Gln Ile Glu Glu Val
115 120 125

Ala Leu Met Ala Thr Leu Val Arg Val Ala Asp Arg Ala Gly Ala Asn
130 135 140

Leu Phe Glu Leu Glu Asn Phe Val Ala Arg Glu Val Asp Val Ala Pro
145 150 155 160

Ala Ala Ser Gly Ala Pro His Ala Ala Gly Gly Arg Leu
165 170

(2) INFORMATION FOR SEQ ID NO: 84:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 107 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:84: 

Arg Ala Asp Glu Arg Lys Asn Thr Thr Met Lys Met Val Lys Ser Ile
1 5 10 15

Ala Ala Gly Leu Thr Ala Ala Ala Ala Ile Gly Ala Ala Ala Ala Gly
20 25 30

Val Thr Ser Ile Met Ala Gly Gly Pro Val Val Tyr Gln Met Gln Pro
35 40 45

Val Val Phe Gly Ala Pro Leu Pro Leu Asp Pro Xaa Ser Ala Pro Xaa
50 55 60

Val Pro Thr Ala Ala Gln Trp Thr Xaa Leu Leu Asn Xaa Leu Xaa Asp
65 70 75 80

Pro Asn Val Ser Phe Xaa Asn Lys Gly Ser Leu Val Glu Gly Gly Ile
85 90 95

Gly Gly Xaa Glu Gly Xaa Xaa Arg Arg Xaa Gln
100 105

(2) INFORMATION FOR SEQ ID NO: 85:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 125 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85:

Val Leu Ser Val Pro Val Gly Asp Gly Phe Trp Xaa Arg Val Val Asn
1 5 10 15

Pro Leu Gly Gln Pro Ile Asp Gly Arg Gly Asp Val Asp Ser Asp Thr
20 25 30

Arg Arg Ala Leu Glu Leu Gln Ala Pro Ser Val Val Xaa Arg Gln Gly
35 40 45

Val Lys Glu Pro Leu Xaa Thr Gly Ile Lys Ala Ile Asp Ala Met Thr
50 55 60

Pro Ile Gly Arg Gly Gln Arg Gln Leu Ile Ile Gly Asp Arg Lys Thr
65 70 75 80

Gly Lys Asn Arg Arg Leu Cys Arg Thr Pro Ser Ser Asn Gln Arg Glu
85 90 95

Glu Leu Gly Val Arg Trp Ile Pro Arg Ser Arg Cys Ala Cys Val Tyr
100 105 110

Val Gly His Arg Ala Arg Arg Gly Thr Tyr His Arg Arg
115 120 125

(2) INFORMATION FOR SEQ ID NO: 86:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 117 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 86:

Cys Asp Ala Val Met Gly Phe Leu Gly Gly Ala Gly Pro Leu Ala Val
1 5 10 15

Val Asp Gln Gln Leu Val Thr Arg Val Pro Gln Gly Trp Ser Phe Ala
20 25 30

Gln Ala Ala Ala Val Pro Val Val Phe Leu Thr Ala Trp Tyr Gly Leu
35 40 45

Ala Asp Leu Ala Glu Ile Lys Ala Gly Glu Ser Val Leu Ile His Ala
50 55 60

Gly Thr Gly Gly Val Gly Met Ala Ala Val Gln Leu Ala Arg Gln Trp
65 70 75 80

Gly Val Glu Val Phe Val Thr Ala Ser Arg Gly Lys Trp Asp Thr Leu
85 90 95

Arg Ala Xaa Xaa Phe Asp Asp Xaa Pro Tyr Arg Xaa Phe Pro His Xaa
100 105 110

Arg Ser Ser Xaa Gly
115



(2) INFORMATION FOR SEQ ID NO: 87: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 103 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 87:

Met Tyr Arg Phe Ala Cys Arg Thr Leu Met Leu Ala Ala Cys Ile Leu
1 5 10 15

Ala Thr Gly Val Ala Gly Leu Gly Val Gly Ala Gln Ser Ala Ala Gln
20 25 30

Thr Ala Pro Val Pro Asp Tyr Tyr Trp Cys Pro Gly Gln Pro Phe Asp
35 40 45

Pro Ala Trp Gly Pro Asn Trp Asp Pro Tyr Thr Cys His Asp Asp Phe
50 55 60

His Arg Asp Ser Asp Gly Pro Asp His Ser Arg Asp Tyr Pro Gly Pro
65 70 75 80

Ile Leu Glu Gly Pro Val Leu Asp Asp Pro Gly Ala Ala Pro Pro Pro
85 90 95

Pro Ala Ala Gly Gly Gly Ala
100



(2) INFORMATION FOR SEQ ID NO: 88:



(i) SEQUENCE CHARACTERISTICS: 

(A) LEKGTH: 88 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(Xi) SEQUENCE DESCRIPTION: SEQ ID NO: 88: 

Val Gin Cys .^g Val Trp Leu Glu He Gin Trp Arg Gly Met Leu Gly 
* ^ 10 15 

Ala Asp Gin Ala .^g Ala Gly Gly Pro Ala Arg He Trp Arg Glu His 
20 25 30 



Ser Met Ala Ala Met Lys Pro Arg Thr Gly Asp Gly Pro Leu Glu Ala 

40 45 

Thr Lys Glu Gly Arg Gly He Val Met Arg Val Pro Leu Glu Glv Gly 
5^ 55 60 

Gly Arg Leu Val Val Glu Leu Thr Pro Asp Glu Ala Ala Ala Leu Gly 
70 75 go 

Asp Glu Leu Lys Gly Val Thr Ser 
35 



(2) INFORMATION FOR SEQ ID NO: 89: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 95 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 89: 

Thr Asp Ala Ala Thr Leu Ala Gln Glu Ala Gly Asn Phe Glu Arg Ile 
1 5 10 15 

Ser Gly Asp Leu Lys Thr Gln Ile Asp Gln Val Glu Ser Thr Ala Gly 
20 25 30 

Ser Leu Gln Gly Gln Trp Arg Gly Ala Ala Gly Thr Ala Ala Gln Ala 
35 40 45 

Ala Val Val Arg Phe Gln Glu Ala Ala Asn Lys Gln Lys Gln Glu Leu 
50 55 60 

Asp Glu Ile Ser Thr Asn Ile Arg Gln Ala Gly Val Gln Tyr Ser Arg 
65 70 75 80 

Ala Asp Glu Glu Gln Gln Gln Ala Leu Ser Ser Gln Met Gly Phe 
85 90 95 

(2) INFORMATION FOR SEQ ID NO: 90: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 166 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 90: 

Met Thr Gln Ser Gln Thr Val Thr Val Asp Gln Gln Glu Ile Leu Asn 
1 5 10 15 

Arg Ala Asn Glu Val Glu Ala Pro Met Ala Asp Pro Pro Thr Asp Val 
20 25 30 

Pro Ile Thr Pro Cys Glu Leu Thr Xaa Xaa Lys Asn Ala Ala Gln Gln 
35 40 45 

Xaa Val Leu Ser Ala Asp Asn Met Arg Glu Tyr Leu Ala Ala Gly Ala 
50 55 60 

Lys Glu Arg Gln Arg Leu Ala Thr Ser Leu Arg Asn Ala Ala Lys Xaa 
65 70 75 80 

Tyr Gly Glu Val Asp Glu Glu Ala Ala Thr Ala Leu Asp Asn Asp Gly 
85 90 95 

Glu Gly Thr Val Gln Ala Glu Ser Ala Gly Ala Val Gly Gly Asp Ser 
100 105 110 

Ser Ala Glu Leu Thr Asp Thr Pro Arg Val Ala Thr Ala Gly Glu Pro 
115 120 125 

Asn Phe Met Asp Leu Lys Glu Ala Ala Arg Lys Leu Glu Thr Gly Asp 
130 135 140 

Gln Gly Ala Ser Leu Ala His Xaa Gly Asp Gly Trp Asn Thr Xaa Thr 
145 150 155 160 

Leu Thr Leu Gln Gly Asp 
165 

(2) INFORMATION FOR SEQ ID NO: 91: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 91: 

Arg Ala Glu Arg Met 

1 5 

(2) INFORMATION FOR SEQ ID NO: 92: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 263 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 92: 

Val Ala Trp Met Ser Val Thr Ala Gly Gln Ala Glu Leu Thr Ala Ala 
1 5 10 15 

Gln Val Arg Val Ala Ala Ala Ala Tyr Glu Thr Ala Tyr Gly Leu Thr 
20 25 30 

Val Pro Pro Pro Val Ile Ala Glu Asn Arg Ala Glu Leu Met Ile Leu 
35 40 45 

Ile Ala Thr Asn Leu Leu Gly Gln Asn Thr Pro Ala Ile Ala Val Asn 
50 55 60 

Glu Ala Glu Tyr Gly Glu Met Trp Ala Gln Asp Ala Ala Ala Met Phe 
65 70 75 80 

Gly Tyr Ala Ala Ala Thr Ala Thr Ala Thr Ala Thr Leu Leu Pro Phe 
85 90 95 

Glu Glu Ala Pro Glu Met Thr Ser Ala Gly Gly Leu Leu Glu Gln Ala 
100 105 110 

Ala Ala Val Glu Glu Ala Ser Asp Thr Ala Ala Ala Asn Gln Leu Met 
115 120 125 

Asn Asn Val Pro Gln Ala Leu Lys Gln Leu Ala Gln Pro Thr Gln Gly 
130 135 140 

Thr Thr Pro Ser Ser Lys Leu Gly Gly Leu Trp Lys Thr Val Ser Pro 
145 150 155 160 

His Arg Ser Pro Ile Ser Asn Met Val Ser Met Ala Asn Asn His Met 
165 170 175 

Ser Met Thr Asn Ser Gly Val Ser Met Thr Asn Thr Leu Ser Ser Met 
180 185 190 

Leu Lys Gly Phe Ala Pro Ala Ala Ala Ala Gln Ala Val Gln Thr Ala 
195 200 205 

Ala Gln Asn Gly Val Arg Ala Met Ser Ser Leu Gly Ser Ser Leu Gly 
210 215 220 

Ser Ser Gly Leu Gly Gly Gly Val Ala Ala Asn Leu Gly Arg Ala Ala 
225 230 235 240 

Ser Val Arg Tyr Gly His Arg Asp Gly Gly Lys Tyr Ala Xaa Ser Gly 
245 250 255 

Arg Arg Asn Gly Gly Pro Ala 
260 

(2) INFORMATION FOR SEQ ID NO: 93: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 303 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 93: 

Met Thr Tyr Ser Pro Gly Asn Pro Gly Tyr Pro Gln Ala Gln Pro Ala 
1 5 10 15 

Gly Ser Tyr Gly Gly Val Thr Pro Ser Phe Ala His Ala Asp Glu Gly 
20 25 30 

Ala Ser Lys Leu Pro Met Tyr Leu Asn Ile Ala Val Ala Val Leu Gly 
35 40 45 

Leu Ala Ala Tyr Phe Ala Ser Phe Gly Pro Met Phe Thr Leu Ser Thr 
50 55 60 

Glu Leu Gly Gly Gly Asp Gly Ala Val Ser Gly Asp Thr Gly Leu Pro 
65 70 75 80 

Val Gly Val Ala Leu Leu Ala Ala Leu Leu Ala Gly Val Val Leu Val 
85 90 95 

Pro Lys Ala Lys Ser His Val Thr Val Val Ala Val Leu Gly Val Leu 
100 105 110 

Gly Val Phe Leu Met Val Ser Ala Thr Phe Asn Lys Pro Ser Ala Tyr 
115 120 125 

Ser Thr Gly Trp Ala Leu Trp Val Val Leu Ala Phe Ile Val Phe Gln 
130 135 140 

Ala Val Ala Ala Val Leu Ala Leu Leu Val Glu Thr Gly Ala Ile Thr 
145 150 155 160 

Ala Pro Ala Pro Arg Pro Lys Phe Asp Pro Tyr Gly Gln Tyr Gly Arg 
165 170 175 

Tyr Gly Gln Tyr Gly Gln Tyr Gly Val Gln Pro Gly Gly Tyr Tyr Gly 
180 185 190 

Gln Gln Gly Ala Gln Gln Ala Ala Gly Leu Gln Ser Pro Gly Pro Gln 
195 200 205 

Gln Ser Pro Gln Pro Pro Gly Tyr Gly Ser Gln Tyr Gly Gly Tyr Ser 
210 215 220 

Ser Ser Pro Ser Gln Ser Gly Ser Gly Tyr Thr Ala Gln Pro Pro Ala 
225 230 235 240 

Gln Pro Pro Ala Gln Ser Gly Ser Gln Gln Ser His Gln Gly Pro Ser 
245 250 255 

Thr Pro Pro Thr Gly Phe Pro Ser Phe Ser Pro Pro Pro Pro Val Ser 
260 265 270 

Ala Gly Thr Gly Ser Gln Ala Gly Ser Ala Pro Val Asn Tyr Ser Asn 
275 280 285 

Pro Ser Gly Gly Glu Gln Ser Ser Ser Pro Gly Gly Ala Pro Val 
290 295 300 

(2) INFORMATION FOR SEQ ID NO: 94: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 507 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 94: 

ATGAAGATGG TGAAATCGAT CGCCGCAGGT CTGACCGCCG CGGCTGCAAT CGGCGCCGCT 60 

GCGGCCGGTG TGACTTCGAT CATGGCTGGC GGCCCGGTCG TATACCAGAT GCAGCCGGTC 120 

GTCTTCGGCG CGCCACTGCC GTTGGACCCG GCATCCGCCC CTGACGTCCC GACCGCCGCC 180 

CAGTTGACCA GCCTGCTCAA CAGCCTCGCC GATCCCAACG TGTCGTTTGC GAACAAGGGC 240 

AGTCTGGTCG AGGGCGGCAT CGGGGGCACC GAGGCGCGCA TCGCCGACCA CAAGCTGAAG 300 

AAGGCCGCCG AGCACGGGGA TCTGCCGCTG TCGTTCAGCG TGACGAACAT CCAGCCGGCG 360 

GCCGCCGGTT CGGCCACCGC CGACGTTTCC GTCTCGGGTC CGAAGCTCTC GTCGCCGGTC 420 

ACGCAGAACG TCACGTTCGT GAATCAAGGC GGCTGGATGC TGTCACGCGC ATCGGCGATG 480 

GAGTTGCTGC AGGCCGCAGG GAACTGA 507 

(2) INFORMATION FOR SEQ ID NO: 95: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 168 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 95: 

Met Lys Met Val Lys Ser Ile Ala Ala Gly Leu Thr Ala Ala Ala Ala 
1 5 10 15 

Ile Gly Ala Ala Ala Ala Gly Val Thr Ser Ile Met Ala Gly Gly Pro 
20 25 30 

Val Val Tyr Gln Met Gln Pro Val Val Phe Gly Ala Pro Leu Pro Leu 
35 40 45 

Asp Pro Ala Ser Ala Pro Asp Val Pro Thr Ala Ala Gln Leu Thr Ser 
50 55 60 

Leu Leu Asn Ser Leu Ala Asp Pro Asn Val Ser Phe Ala Asn Lys Gly 
65 70 75 80 

Ser Leu Val Glu Gly Gly Ile Gly Gly Thr Glu Ala Arg Ile Ala Asp 
85 90 95 

His Lys Leu Lys Lys Ala Ala Glu His Gly Asp Leu Pro Leu Ser Phe 
100 105 110 

Ser Val Thr Asn Ile Gln Pro Ala Ala Ala Gly Ser Ala Thr Ala Asp 
115 120 125 

Val Ser Val Ser Gly Pro Lys Leu Ser Ser Pro Val Thr Gln Asn Val 
130 135 140 

Thr Phe Val Asn Gln Gly Gly Trp Met Leu Ser Arg Ala Ser Ala Met 
145 150 155 160 

Glu Leu Leu Gln Ala Ala Gly Asn 
165 

(2) INFORMATION FOR SEQ ID NO: 96: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 500 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 96: 

CGTGGCAATG TCGTTGACCG TCGGGGCCGG GGTCGCCTCC GCAGATCCCG TGGACGCGGT 60 

CATTAACACC ACCTGCAATT ACGGGCAGGT AGTAGCTGCG CTCAACGCGA CGGATCCGGG 120 

GGCTGCCGCA CAGTTCAACG CCTCACCGGT GGCGCAGTCC TATTTGCGCA ATTTCCTCGC 180 

CGCACCGCCA CCTCAGCGCG CTGCCATGGC CGCGCAATTG CAAGCTGTGC CGGGGGCGGC 240 

ACAGTACATC GGCCTTGTCG AGTCGGTTGC CGGCTCCTGC AACAACTATT AAGCCCATGC 300 

GGGCCCCATC CCGCGACCCG GCATCGTCGC CGGGGCTAGG CCAGATTGCC CCGCTCCTCA 360 

ACGGGCCGCA TCCCGCGACC CGGCATCGTC GCCGGGGCTA GGCCAGATTG CCCCGCTCCT 420 

CAACGGGCCG CATCTCGTGC CGAATTCCTG CAGCCCGGGG GATCCACTAG TTCTAGAGCG 480 

GCCGCCACCG CGGTGGAGCT 500 

(2) INFORMATION FOR SEQ ID NO: 97: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 96 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 97: 

Val Ala Met Ser Leu Thr Val Gly Ala Gly Val Ala Ser Ala Asp Pro 
1 5 10 15 

Val Asp Ala Val Ile Asn Thr Thr Cys Asn Tyr Gly Gln Val Val Ala 
20 25 30 

Ala Leu Asn Ala Thr Asp Pro Gly Ala Ala Ala Gln Phe Asn Ala Ser 
35 40 45 

Pro Val Ala Gln Ser Tyr Leu Arg Asn Phe Leu Ala Ala Pro Pro Pro 
50 55 60 

Gln Arg Ala Ala Met Ala Ala Gln Leu Gln Ala Val Pro Gly Ala Ala 
65 70 75 80 

Gln Tyr Ile Gly Leu Val Glu Ser Val Ala Gly Ser Cys Asn Asn Tyr 
85 90 95 

(2) INFORMATION FOR SEQ ID NO: 98: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 154 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 98: 
ATGACAGAGC AGCAGTGGAA TTTCGCGGGT ATCGAGGCCG CGGCAAGCGC AATCCAGGGA 60 
AATGTCACGT CCATTCATTC CCTCCTTGAC GAGGGGAAGC AGTCCCTGAC CAAGCTCGCA 120 
GCGGCCTGGG GCGGTAGCGG TTCGGAAGCG TACC 154 
(2) INFORMATION FOR SEQ ID NO: 99: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 51 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 99: 

Met Thr Glu Gln Gln Trp Asn Phe Ala Gly Ile Glu Ala Ala Ala Ser 
1 5 10 15 

Ala Ile Gln Gly Asn Val Thr Ser Ile His Ser Leu Leu Asp Glu Gly 
20 25 30 

Lys Gln Ser Leu Thr Lys Leu Ala Ala Ala Trp Gly Gly Ser Gly Ser 
35 40 45 

Glu Ala Tyr 
50 

(2) INFORMATION FOR SEQ ID NO: 100: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 282 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 100: 
CGGTCGCGCA CTTCCAGGTG ACTATGAAAG TCGGCTTCCG NCTGGAGGAT TCCTGAACCT 60 

TCAAGCGCGG CCGATAACTG AGGTGCATCA TTAAGCGACT TTTCCAGAAC ATCCTGACGC 120 

GCTCGAAACG CGGCACAGCC GACGGTGGCT CCGNCGAGGC GCTGNCTCCA AAATCCCTGA 180 

GACAATTCGN CGGGGGCGCC TACAAGGAAG TCGGTGCTGA ATTCGNCGNG TATCTGGTCG 240 

ACCTGTGTGG TCTGNAGCCG GACGAAGCGG TGCTCGACGT CG 282 
(2) INFORMATION FOR SEQ ID NO: 101: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3058 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 101: 

GATCGTACCC GTGCGAGTGC TCGGGCCGTT TGAGGATGGA GTGCACGTGT CTTTCGTGAT 60 

GGCATACCCA GAGATGTTGG CGGCGGCGGC TGACACCCTG CAGAGCATCG GTGCTACCAC 120 

TGTGGCTAGC AATGCCGCTG CGGCGGCCCC GACGACTGGG GTGGTGCCCC CCGCTGCCGA 180 

TGAGGTGTCG GCGCTGACTG CGGCGCACTT CGCCGCACAT GCGGCGATGT ATCAGTCCGT 240 

GAGCGCTCGG GCTGCTGCGA TTCATGACCA GTTCGTGGCC ACCCTTGCCA GCAGCGCCAG 300 

CTCGTATGCG GCCACTGAAG TCGCCAATGC GGCGGCGGCC AGCTAAGCCA GGAACAGTCG 360 

GCACGAGAAA CCACGAGAAA TAGGGACACG TAATGGTGGA TTTCGGGGCG TTACCACCGG 420 

AGATCAACTC CGCGAGGATG TACGCCGGCC CGGGTTCGGC CTCGCTGGTG GCCGCGGCTC 480 

AGATGTGGGA CAGCGTGGCG AGTGACCTGT TTTCGGCCGC GTCGGCGTTT CAGTCGGTGG 540 

TCTGGGGTCT GACGGTGGGG TCGTGGATAG GTTCGTCGGC GGGTCTGATG GTGGCGGCGG 600 

CCTCGCCGTA TGTGGCGTGG ATGAGCGTCA CCGCGGGGCA GGCCGAGCTG ACCGCCGCCC 660 

AGGTCCGGGT TGCTGCGGCG GCCTACGAGA CGGCGTATGG GCTGACGGTG CCCCCGCCGG 720 

TGATCGCCGA GAACCGTGCT GAACTGATGA TTCTGATAGC GACCAACCTC TTGGGGCAAA 780 

ACACCCCGGC GATCGCGGTC AACGAGGCCG AATACGGCGA GATGTGGGCC CAAGACGCCG 840 

CCGCGATGTT TGGCTACGCC GCGGCGACGG CGACGGCGAC GGCGACGTTG CTGCCGTTCG 900 

AGGAGGCGCC GGAGATGACC AGCGCGGGTG GGCTCCTCGA GCAGGCCGCC GCGGTCGAGG 960 

AGGCCTCCGA CACCGCCGCG GCGAACCAGT TGATGAACAA TGTGCCCCAG GCGCTGCAAC 1020 

AGCTGGCCCA GCCCACGCAG GGCACCACGC CTTCTTCCAA GCTGGGTGGC CTGTGGAAGA 1080 

CGGTCTCGCC GCATCGGTCG CCGATCAGCA ACATGGTGTC GATGGCCAAC AACCACATGT 1140 

CGATGACCAA CTCGGGTGTG TCGATGACCA ACACCTTGAG CTCGATGTTG AAGGGCTTTG 1200 

CTCCGGCGGC GGCCGCCCAG GCCGTGCAAA CCGCGGCGCA AAACGGGGTC CGGGCGATGA 1260 

GCTCGCTGGG CAGCTCGCTG GGTTCTTCGG GTCTGGGCGG TGGGGTGGCC GCCAACTTGG 1320 

GTCGGGCGGC CTCGGTCGGT TCGTTGTCGG TGCCGCAGGC CTGGGCCGCG GCCAACCAGG 1380 

CAGTCACCCC GGCGGCGCGG GCGCTGCCGC TGACCAGCCT GACCAGCGCC GCGGAAAGAG 1440 

GGCCCGGGCA GATGCTGGGC GGGCTGCCGG TGGGGCAGAT GGGCGCCAGG GCCGGTGGTG 1500 

GGCTCAGTGG TGTGCTGCGT GTTCCGCCGC GACCCTATGT GATGCCGCAT TCTCCGGCGG 1560 

CCGGCTAGGA GAGGGGGCGC AGACTGTCGT TATTTGACCA GTGATCGGCG GTCTCGGTGT 1620 

TTCCGCGGCC GGCTATGACA ACAGTCAATG TGCATGACAA GTTACAGGTA TTAGGTCCAG 1680 

GTTCAACAAG GAGACAGGCA ACATGGCCTC ACGTTTTATG ACGGATCCGC ACGCGATGCG 1740 

GGACATGGCG GGCCGTTTTG AGGTGCACGC CCAGACGGTG GAGGACGAGG CTCGCCGGAT 1800 

GTGGGCGTCC GCGCAAAACA TTTCCGGTGC GGGCTGGAGT GGCATGGCCG AGGCGACCTC 1860 

GCTAGACACC ATGGCCCAGA TGAATCAGGC GTTTCGCAAC ATCGTGAACA TGCTGCACGG 1920 

GGTGCGTGAC GGGCTGGTTC GCGACGCCAA CAACTACGAG CAGCAAGAGC AGGCCTCCCA 1980 

GCAGATCCTC AGCAGCTAAC GTCAGCCGCT GCAGCACAAT ACTTTTACAA GCGAAGGAGA 2040 

ACAGGTTCGA TGACCATCAA CTATCAATTC GGGGATGTCG ACGCTCACGG CGCCATGATC 2100 

CGCGCTCAGG CCGGGTTGCT GGAGGCCGAG CATCAGGCCA TCATTCGTGA TGTGTTGACC 2160 

GCGAGTGACT TTTGGGGCGG CGCCGGTTCG GCGGCCTGCC AGGGGTTCAT TACCCAGTTG 2220 

GGCCGTAACT TCCAGGTGAT CTACGAGCAG GCCAACGCCC ACGGGCAGAA GGTGCAGGCT 2280 

GCCGGCAACA ACATGGCGCA AACCGACAGC GCCGTCGGCT CCAGCTGGGC CTGACACCAG 2340 

GCCAAGGCCA GGGACGTGGT GTACGAGTGA AGTTCCTCGC GTGATCCTTC GGGTGGCAGT 2400 

CTAAGTGGTC AGTGCTGGGG TGTTGGTGGT TTGCTGCTTG GCGGGTTCTT CGGTGCTGGT 2460 

CAGTGCTGCT CGGGCTCGGG TGAGGACCTC GAGGCCCAGG TAGCGCCGTC CTTCGATCCA 2520 

TTCGTCGTGT TGTTCGGCGA GGACGGCTCC GACGAGGCGG ATGATCGAGG CGCGGTCGGG 2580 

GAAGATGCCC ACGACGTCGG TTCGGCGTCG TACGTCTCGG TTGAGGCGTT CCTGGGGGTT 2640 

GTTGGACCAG ATTTGGCGCC AGATCTGCTT GGGGAAGGCG GTGAACGCCA GCAGGTCGGT 2700 

GCGGGCGGTG TCGAGGTGCT CGGCCACCGC GGGGAGTTTG TCGGTCAGAG CGTCGAGTAC 2760 

CCGATCATAT TGGGCAACAA CTGATTCGGC GTCGGGCTGG TCGTAGATGG AGTGCAGCAG 2820 

GGTGCGCACC CACGGCCAGG AGGGCTTCGG GGTGGCTGCC ATCAGATTGG CTGCGTAGTG 2880 

GGTTCTGCAG CGCTGCCAGG CCGCTGCGGG CAGGGTGGCG CCGATCGCGG CCACCAGGCC 2940 

GGCGTGGGCG TCGCTGGTGA CCAGCGCGAC CCCGGACAGG CCGCGGGCGA CCAGGTCGCG 3000 

GAAGAACGCC AGCCAGCCGG CCCCGTCCTC GGCGGAGGTG ACCTGGATGC CCAGGATC 3058 
(2) INFORMATION FOR SEQ ID NO: 102: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 391 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 102: 

Met Val Asp Phe Gly Ala Leu Pro Pro Glu Ile Asn Ser Ala Arg Met 
1 5 10 15 

Tyr Ala Gly Pro Gly Ser Ala Ser Leu Val Ala Ala Ala Gln Met Trp 
20 25 30 

Asp Ser Val Ala Ser Asp Leu Phe Ser Ala Ala Ser Ala Phe Gln Ser 
35 40 45 

Val Val Trp Gly Leu Thr Val Gly Ser Trp Ile Gly Ser Ser Ala Gly 
50 55 60 

Leu Met Val Ala Ala Ala Ser Pro Tyr Val Ala Trp Met Ser Val Thr 
65 70 75 80 

Ala Gly Gln Ala Glu Leu Thr Ala Ala Gln Val Arg Val Ala Ala Ala 
85 90 95 

Ala Tyr Glu Thr Ala Tyr Gly Leu Thr Val Pro Pro Pro Val Ile Ala 
100 105 110 

Glu Asn Arg Ala Glu Leu Met Ile Leu Ile Ala Thr Asn Leu Leu Gly 
115 120 125 

Gln Asn Thr Pro Ala Ile Ala Val Asn Glu Ala Glu Tyr Gly Glu Met 
130 135 140 

Trp Ala Gln Asp Ala Ala Ala Met Phe Gly Tyr Ala Ala Ala Thr Ala 
145 150 155 160 

Thr Ala Thr Ala Thr Leu Leu Pro Phe Glu Glu Ala Pro Glu Met Thr 
165 170 175 

Ser Ala Gly Gly Leu Leu Glu Gln Ala Ala Ala Val Glu Glu Ala Ser 
180 185 190 

Asp Thr Ala Ala Ala Asn Gln Leu Met Asn Asn Val Pro Gln Ala Leu 
195 200 205 

Gln Gln Leu Ala Gln Pro Thr Gln Gly Thr Thr Pro Ser Ser Lys Leu 
210 215 220 

Gly Gly Leu Trp Lys Thr Val Ser Pro His Arg Ser Pro Ile Ser Asn 
225 230 235 240 

Met Val Ser Met Ala Asn Asn His Met Ser Met Thr Asn Ser Gly Val 
245 250 255 

Ser Met Thr Asn Thr Leu Ser Ser Met Leu Lys Gly Phe Ala Pro Ala 
260 265 270 

Ala Ala Ala Gln Ala Val Gln Thr Ala Ala Gln Asn Gly Val Arg Ala 
275 280 285 

Met Ser Ser Leu Gly Ser Ser Leu Gly Ser Ser Gly Leu Gly Gly Gly 
290 295 300 

Val Ala Ala Asn Leu Gly Arg Ala Ala Ser Val Gly Ser Leu Ser Val 
305 310 315 320 

Pro Gln Ala Trp Ala Ala Ala Asn Gln Ala Val Thr Pro Ala Ala Arg 
325 330 335 

Ala Leu Pro Leu Thr Ser Leu Thr Ser Ala Ala Glu Arg Gly Pro Gly 
340 345 350 

Gln Met Leu Gly Gly Leu Pro Val Gly Gln Met Gly Ala Arg Ala Gly 
355 360 365 

Gly Gly Leu Ser Gly Val Leu Arg Val Pro Pro Arg Pro Tyr Val Met 
370 375 380 

Pro His Ser Pro Ala Ala Gly 
385 390 

(2) INFORMATION FOR SEQ ID NO: 103: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1725 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 103: 

GACGTCAGCA CCCGCCGTGC AGGGCTGGAG CGTGGTCGGT TTTGATCTGC GGTCAAGGTG 60 

ACGTCCCTCG GCGTGTCGCC GGCGTGGATG CAGACTCGAT GCCGCTCTTT AGTGCAACTA 120 

ATTTCGTTGA AGTGCCTGCG AGGTATAGGA CTTCACGATT GGTTAATGTA GCGTTCACCC 180 

CGTGTTGGGG TCGATTTGGC CGGACCAGTC GTCACCAACG CrtTGGCGTGC GCGCCAGGCG 240 

GGCGATCAGA TCGCTTGACT ACCAATCAAT CTTGAGCTCC CGGGCCGATG CTCGGGCTAA 300 

ATGAGGAGGA GCACGCGTGT CTTTCACTGC GCAACCGGAG ATGTTGGCGG CCGCGGCTCG 360 

CGAACTTCGT TCCCTGGGGG CAACGCTGAA GGCTAGCAAT GCCGCCGCAG CCGTGCCGAC 420 

GACTGGGGTG GTGCCCCCGG CTGCCGACGA GGTGTCGCTG CTGCTTGCCA CACAATTCCG 480 

TACGCATGCG GCGACGTATC AGACGGCCAG CGCCAAGGCC GCGGTGATCC ATGAGCAGTT 540 

TGTGACCACG CTGGCCACCA GCGCTAGTTC ATATGCGGAC ACCGAGGCCG CCAACGCTCT 600 

GGTCACCGGC TAGCTGACCT GACGGTATTC GAGCGGAAGG ATTATCGAAG TGGTGGATTT 660 

CGGGGCGTTA CCACCGGAGA TCAACTCCGC GAGGATGTAC GCCGGCCCGG GTTCGGCCTC 720 

GCTGGTGGCC GCCGCGAAGA TGTGGGACAG CGTGGCGAGT GACCTGTTTT CGGCCGCGTC 780 

GGCGTTTCAG TCGGTGGTCT GGGGTCTGAC GGTGGGGTCG TGGATAGGTT CGTCGGCGGG 840 

TCTGATGGCG GCGGCGGCCT CGCCGTATGT GGCGTGGATG AGCGTCACCG CGGGGCAGGC 900 

CCAGCTGACC GCCGCCCAGG TCCGGGTTGC TGCGGCGGCC TACGAGACAG CGTATAGGCT 960 

GACGGTGCCG CCGCCGGTGA TCGCCGAGAA CCGTACCGAA CTGATGACGC TGACCGCGAC 1020 

CAACCTCTTG GGGCAAAACA CGCCGGCGAT CGAGGCCAAT CAGGCCGCAT ACAGCCAGAT 1080 

GTGGGGCCAA GACGCGGAGG CGATGTATGG CTACGCCGCC ACGGCGGCGA CGGCGACCGA 1140 

GGCGTTGCTG CCGTTCGAGG ACGCCCCACT GATCACCAAC CCCGGCGGGC TCCTTGAGCA 1200 

GGCCGTCGCG GTCGAGGAGG CCATCGACAC CGCCGCGGCG AACCAGTTGA TGAACAATGT 1260 

GCCCCAAGCG CTGCAACAGC TGGCCCAGCC AGCGCAGGGC GTCGTACCTT CTTCCAAGCT 1320 

GGGTGGGCTG TGGACGGCGG TCTCGCCGCA TCTGTCGCCG CTCAGCAACG TCAGTTCGAT 1380 

AGCCAACAAC CACATGTCGA TGATGGGCAG GGGTGTGTCG ATGACCAACA CCTTGCACTC 1440 

GATGTTGAAG GGCTTAGCTC CGGCGGCGGC TCAGGCCGTG GAAACCGCGG CGGAAAACGG 1500 

GGTCTGGGCG ATGAGCTCGC TGGGCAGCGA GCTGGGTTCG TCGCTGGGTT CTTCGGGTCT 1560 

GGGCGCTGGG GTGGCCGCCA ACTTGGGTCG GGCGGCCTCG GTCGGTTCGT TGTCGGTGCC 1620 

GCCAGCATGG GCCGCGGCCA ACCAGGCGGT CACCCCGGCG GCGCGGGCGC TGCCGCTGAC 1680 

CAGCCTGACC AGCGCCGCCC AAACCGCCCC CGGACACATG CTGGG 1725 

(2) INFORMATION FOR SEQ ID NO: 104: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 359 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 104: 

Val Val Asp Phe Gly Ala Leu Pro Pro Glu Ile Asn Ser Ala Arg Met 
1 5 10 15 

Tyr Ala Gly Pro Gly Ser Ala Ser Leu Val Ala Ala Ala Lys Met Trp 
20 25 30 

Asp Ser Val Ala Ser Asp Leu Phe Ser Ala Ala Ser Ala Phe Gln Ser 
35 40 45 

Val Val Trp Gly Leu Thr Val Gly Ser Trp Ile Gly Ser Ser Ala Gly 
50 55 60 

Leu Met Ala Ala Ala Ala Ser Pro Tyr Val Ala Trp Met Ser Val Thr 
65 70 75 80 

Ala Gly Gln Ala Gln Leu Thr Ala Ala Gln Val Arg Val Ala Ala Ala 
85 90 95 

Ala Tyr Glu Thr Ala Tyr Arg Leu Thr Val Pro Pro Pro Val Ile Ala 
100 105 110 

Glu Asn Arg Thr Glu Leu Met Thr Leu Thr Ala Thr Asn Leu Leu Gly 
115 120 125 

Gln Asn Thr Pro Ala Ile Glu Ala Asn Gln Ala Ala Tyr Ser Gln Met 
130 135 140 

Trp Gly Gln Asp Ala Glu Ala Met Tyr Gly Tyr Ala Ala Thr Ala Ala 
145 150 155 160 

Thr Ala Thr Glu Ala Leu Leu Pro Phe Glu Asp Ala Pro Leu Ile Thr 
165 170 175 

Asn Pro Gly Gly Leu Leu Glu Gln Ala Val Ala Val Glu Glu Ala Ile 
180 185 190 

Asp Thr Ala Ala Ala Asn Gln Leu Met Asn Asn Val Pro Gln Ala Leu 
195 200 205 

Gln Gln Leu Ala Gln Pro Ala Gln Gly Val Val Pro Ser Ser Lys Leu 
210 215 220 

Gly Gly Leu Trp Thr Ala Val Ser Pro His Leu Ser Pro Leu Ser Asn 
225 230 235 240 

Val Ser Ser Ile Ala Asn Asn His Met Ser Met Met Gly Thr Gly Val 
245 250 255 

Ser Met Thr Asn Thr Leu His Ser Met Leu Lys Gly Leu Ala Pro Ala 
260 265 270 

Ala Ala Gln Ala Val Glu Thr Ala Ala Glu Asn Gly Val Trp Ala Met 
275 280 285 

Ser Ser Leu Gly Ser Gln Leu Gly Ser Ser Leu Gly Ser Ser Gly Leu 
290 295 300 

Gly Ala Gly Val Ala Ala Asn Leu Gly Arg Ala Ala Ser Val Gly Ser 
305 310 315 320 

Leu Ser Val Pro Pro Ala Trp Ala Ala Ala Asn Gln Ala Val Thr Pro 
325 330 335 

Ala Ala Arg Ala Leu Pro Leu Thr Ser Leu Thr Ser Ala Ala Gln Thr 
340 345 350 

Ala Pro Gly His Met Leu Gly 
355 

(2) INFORMATION FOR SEQ ID NO: 105: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3027 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 105: 

AGTTCAGTCG AGAATGATAC TGACGGGCTG TATCCACGAT GGCTGAGACA ACCGAACCAC 60 

C3TCGGAC3C GGGGACATCG CAAGCCGACG CGATGGCGTT GGCCGCCGAA GCCGAAGCCG 120 

CCGAAGCCGA AGCGCTGGCC GCCGCGGCGC GGGCCCGTGC CCGTGCCGCC CGGTTGAAGC 180 

GTGAGGCGCT GGCGATGGCC CCAGCCGAGG ACGAGAACGT CCCCGAGGAT ATGCAGACTG 240 

GGAAGACGCC GAAGACTATG ACGACTATGA CGACTATGAG GCCGCAGACC AGGAGGCCGC 300 

ACGGTCGGCA TCCTGGCGAC GGCGGTTGCG GGTGCGGTTA CCAAGACTGT CCACGATTGC 360 

CATGGCGGCC GCAGTCGTCA TCATCTGCGG CTTCACCGGG CTCAGCGGAT ACATTGTGTG 420 

GCAACACCAT GAGGCCACCG AACGCCAGCA GCGCGCCGCG GCGTTCGCCG CCGGAGCCAA 480 

GCAAGGTGTC ATCAACATGA CCTCGCTGGA CTTCAACAAG GCCAAAGAAG ACGTCGCGCG 540 

TGTGATCGAC AGCTCCACCG GCGAATTCAG GGATGACTTC CAGCAGCGGG CAGCCGATTT 600 

CACCAAGGTT GTCGAACAGT CCAAAGTGGT CACCGAAGGC ACGGTGAACG CGACAGCCGT 660 

CGAATCCATG AACGAGCATT CCGCCGTGGT GCTCGTCGCG GCGACTTCAC GGGTCACCAA 720 

TTCCGCTGGG GCGAAAGACG AACCACGTGC GTGGCGGCTC AAAGTGACCG TGACCGAAGA 780 

GGGGGGACAG TACAAGATGT CGAAAGTTGA GTTCGTACCG TGACCGATGA CGTACGCGAC 840 

GTCAACACCG AAACCACTGA CGCCACCGAA GTCGCTGAGA TCGACTCAGC CGCAGGCGAA 900 

GCCGGTGATT CGGCGACCGA GGCATTTGAC ACCGACTCTG CAACGGAATC TACCGCGCAG 960 

AAGGGTCAGC GGCACCGTGA CCTGTGGCGA ATGCAGGTTA CCTTGAAACC CGTTCCGGTG 1020 

ATTCTCATCC TGCTCATGTT GATCTCTGGG GGCGCGACGG GATGGCTATA CCTTGAGCAA 1080 

TACGACCCGA TCAGCAGACG GACTCCGGCG CCGCCCGTGC TGCCGTCGCC GCGGCGTCTG 1140 

ACGGGACAAT CGCGCTGTTG TGTATTCACC CGACACGTCG ACCAAGACTT CGCTACCGCC 1200 

AGGTCGCACC TCGCCGGCGA TTTCCTGTCC TATACGACCA GTTCACGCAG CAGATCGTGG 1260 

CTCCGGCGGC CAAACAGAAG TCACTGAAAA CCACCGCCAA GGTGGTGCGC GCGGCCGTGT 1320 

CGGAGCTACA TCCGGATTCG GCCGTCGTTC TGGTTTTTGT CGACCAGAGC ACTACCAGTA 1380 

AGGACAGCCC CAATCCGTCG ATGGCGGCCA GCAGCGTGAT GGTGACCCTA GCCAAGGTCG 1440 

ACGGCAATTG GCTGATCACC AAGTTCACCC CGGTTTAGGT TGCCGTAGGC GGTCGCCAAG 1500 

TCTGACGGGG GCGCGGGTGG CTGCTCGTGC GAGATACCGG CCGTTCTCCG GACAATCACG 1560 

GCCCGACCTC AAACAGATCT CGGCCGCTGT CTAATCGGCC GGGTTATTTA AGATTAGTTG 1620 

CCACTGTATT TACCTGATGT TCAGATTGTT CAGCTGGATT TAGCTTCGCG GCAGGGCGGC 1680 

TGGTGCACTT TGCATCTGGG GTTGTGACTA CTTGAGAGAA TTTGACCTGT TGCCGACGTT 1740 

GTTTGCTGTC CATCATTGGT GCTAGTTATG GCCGAGCGGA AGGATTATCG AAGTGGTGGA 1800 

CTTCGGGGCG TTACCACCGG AGATCAACTC CGCGAGGATG TACGCCGGCC CGGGTTCGGC 1860 

CTCGCTGGTG GCCGCCGCGA AGATGTGGGA GAGCGTGGCG AGTGACCTGT TTTCGGCCGC 1920 

GTCGGCGTTT CAGTCGGTGG TCTGGGGTCT GACGACGGGA TCGTGGATAG GTTCGTCGGC 1980 

GGGTCTGATG GTGGCGGCGG CCTCGCCGTA TGTGGCGTGG ATGAGCGTCA CCGCGGGGCA 2040 

GGCCGAGCTG ACCGCCGCCC AGGTCCGGGT TGCTGCGGCG GCCTACGAGA CGGCGTATGG 2100 

GCTGACGGTG CCCCCGCCGG TGATCGCCGA GAACCGTGCT GAACTGATGA TTCTGATAGC 2160 

GACCAACCTC TTGGGGCAAA ACACCCCGGC GATCGCGGTC AACGAGGCCG AATACGGGGA 2220 

GATGTGGGCC CAAGACGCCG CCGCGATGTT TGGCTACGCC GCCACGGCGG CGACGGCGAC 2280 

CGAGGCGTTG CTGCCGTTCG AGGACGCCCC ACTGATCACC AACCCCGGCG GGCTCCTTGA 2340 

GCAGGCCGTC GCGGTCGAGG AGGCCATCGA CACCGCCGCG GCGAACCAGT TGATGAACAA 2400 

TGTGCCCCAA GCGCTGCAAC AACTGGCCCA GCCCACGAAA AGCATCTGGC CGTTCGACCA 2460 

ACTGAGTGAA CTCTGGAAAG CCATCTCGCC GCATCTGTCG CCGCTCAGCA ACATCGTGTC 2520 

GATGCTCAAC AACCACGTGT CGATGACCAA CTCGGGTGTG TCGATGGCCA GCACCTTGCA 2580 

CTCAATGTTG AAGGGCTTTG CTCCGGCGGC GGCTCAGGCC GTGGAAACCG CGGCGCAAAA 2640 

CGGGGTCCAG GCGATGAGCT CGCTGGGCAG CCAGCTGGGT TCGTCGCTGG GTTCTTCGGG 2700 

TCTGGGCGCT GGGGTGGCCG CCAACTTGGG TCGGGCGGCC TCGGTCGGTT CGTTGTCGGT 2760 

GCCGCAGGCC TGGGCCGCGG CCAACCAGGC GGTCACCCCG GCGGCGCGGG CGCTGCCGCT 2820 

GACCAGCCTG ACCAGCGCCG CCCAAACCGC CCCCGGACAC ATGCTGGGCG GGCTACCGCT 2880 

GGGGCAACTG ACCAATAGCG GCGGCGGGTT CGGCGGGGTT AGCAATGCGT TGCGGATGCC 2940 

GCCGCGGGCG TACGTAATGC CCCGTGTGCC CGCCGCCGGG TAACGCCGAT CCGCACGCAA 3000 

TGCGGGCCCT CTATGCGGGC AGCGATC 3027 
(2) INFORMATION FOR SEQ ID NO: 106: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 396 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 106: 

Val Val Asp Phe Gly Ala Leu Pro Pro Glu Ile Asn Ser Ala Arg Met 
1 5 10 15 

Tyr Ala Gly Pro Gly Ser Ala Ser Leu Val Ala Ala Ala Lys Met Trp 
20 25 30 

Asp Ser Val Ala Ser Asp Leu Phe Ser Ala Ala Ser Ala Phe Gln Ser 
35 40 45 

Val Val Trp Gly Leu Thr Thr Gly Ser Trp Ile Gly Ser Ser Ala Gly 
50 55 60 

Leu Met Val Ala Ala Ala Ser Pro Tyr Val Ala Trp Met Ser Val Thr 
65 70 75 80 

Ala Gly Gln Ala Glu Leu Thr Ala Ala Gln Val Arg Val Ala Ala Ala 
85 90 95 

Ala Tyr Glu Thr Ala Tyr Gly Leu Thr Val Pro Pro Pro Val Ile Ala 
100 105 110 

Glu Asn Arg Ala Glu Leu Met Ile Leu Ile Ala Thr Asn Leu Leu Gly 
115 120 125 

Gln Asn Thr Pro Ala Ile Ala Val Asn Glu Ala Glu Tyr Gly Glu Met 
130 135 140 

Trp Ala Gln Asp Ala Ala Ala Met Phe Gly Tyr Ala Ala Thr Ala Ala 
145 150 155 160 

Thr Ala Thr Glu Ala Leu Leu Pro Phe Glu Asp Ala Pro Leu Ile Thr 
165 170 175 

Asn Pro Gly Gly Leu Leu Glu Gln Ala Val Ala Val Glu Glu Ala Ile 
180 185 190 

Asp Thr Ala Ala Ala Asn Gln Leu Met Asn Asn Val Pro Gln Ala Leu 
195 200 205 

Gln Gln Leu Ala Gln Pro Thr Lys Ser Ile Trp Pro Phe Asp Gln Leu 
210 215 220 

Ser Glu Leu Trp Lys Ala Ile Ser Pro His Leu Ser Pro Leu Ser Asn 
225 230 235 240 

Ile Val Ser Met Leu Asn Asn His Val Ser Met Thr Asn Ser Gly Val 
245 250 255 

Ser Met Ala Ser Thr Leu His Ser Met Leu Lys Gly Phe Ala Pro Ala 
260 265 270 

Ala Ala Gln Ala Val Glu Thr Ala Ala Gln Asn Gly Val Gln Ala Met 
275 280 285 

Ser Ser Leu Gly Ser Gln Leu Gly Ser Ser Leu Gly Ser Ser Gly Leu 
290 295 300 

Gly Ala Gly Val Ala Ala Asn Leu Gly Arg Ala Ala Ser Val Gly Ser 
305 310 315 320 

Leu Ser Val Pro Gln Ala Trp Ala Ala Ala Asn Gln Ala Val Thr Pro 
325 330 335 

Ala Ala Arg Ala Leu Pro Leu Thr Ser Leu Thr Ser Ala Ala Gln Thr 
340 345 350 

Ala Pro Gly His Met Leu Gly Gly Leu Pro Leu Gly Gln Leu Thr Asn 
355 360 365 

Ser Gly Gly Gly Phe Gly Gly Val Ser Asn Ala Leu Arg Met Pro Pro 
370 375 380 

Arg Ala Tyr Val Met Pro Arg Val Pro Ala Ala Gly 
385 390 395 



(2) INFORMATION FOR SEQ ID NO: 107: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1616 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 107: 

CATCGGAGGG AGTGATCACC ATGCTGTGGC ACGCAATGCC ACCGGAGTAA ATACCGCACG 60 

GCTGATGGCC GGCGCGGGTC CGGCTCCAAT GCTTGCGGCG GCCGCGGGAT GGCAGACGCT 120 

TTCGGCGGCT CTGGACGCTC AGGCCGTCGA GTTGACCGCG CGCCTGAACT CTCTGGGAGA 180 

AGCCTGGACT GGAGGTGGCA GCGACAAGGC GCTTGCGGCT GCAACGCCGA TGGTGGTCTG 240 

GCTACAAACC GCGTCAACAC AGGCCAAGAC CCGTGCGATG CAGGCGACGG CGCAAGCCGC 300 

GGCATACACC CAGGCCATGG CCACGACGCC GTCGCTGCCG GAGATCGCCG CCAACCACAT 360 

CACCCAGGCC GTCCTTACGG CCACCAACTT CTTCGGTATC AACACGATCC CGATCGCGTT 420 

GACCGAGATG GATTATTTCA TCCGTATGTG GAACCAGGCA GCCCTGGCAA TGGAGGTCTA 480 

CCAGGCCGAG ACCGCGGTTA ACACGCTTTT CGAGAAGCTC GAGCCGATGG CGTCGATCCT 540 

TGATCCCGGC GCGAGCCAGA GCACGACGAA CCCGATCTTC GGAATGCCCT CCCCTGGCAG 600 

CTCAACACCG GTTGGCCAGT TGCCGCCGGC GGCTACCCAG ACCCTCGGCC AACTGGGTGA 660 

GATGAGCGGC CCGATGCAGC AGCTGACCCA GCCGCTGCAG CAGGTGACGT CGTTGTTCAG 720 

CCAGGTGGGC GGCACCGGCG GCGGCAACCC AGCCGACGAG GAAGCCGCGC AGATGGGCCT 780 

GCTCGGCACC AGTCCGCTGT CGAACCATCC GCTGGCTGGT GGATCAGGCC CCAGCGCGGG 840 

CGCGGGCCTG CTGCGCGCGG AGTCGCTACC TGGCGCAGGT GGGTCGTTGA CCCGCACGCC 900 

GCTGATGTCT CAGCTGATCG AAAAGCCGGT TGCCCCCTCG GTGATGCCGG CGGCTGCTGC 960 

CGGATCGTCG GCGACGGGTG GCGCCGCTCC GGTGGGTGCG GGAGCGATGG GCCAGGGTGC 1020 

GCAATCCGGC GGCTCCACCA GGCCGGGTCT GGTCGCGCCG GCACCGCTCG CGCAGGAGCG 1080 

TGAAGAAGAC GACGAGGACG ACTGGGACGA AGAGGACGAC TGGTGAGCTC CCGTAATGAC 1140 

AACAGACTTC CCGGCCACCC GGGCCGGAAG ACTTGCCAAC ATTTTGGCGA GGAAGGTAAA 1200 

GAGAGAAAGT AGTCCAGCAT GGCAGAGATG AAGACCGATG CCGCTACCCT CGCGCAGGAG 1260 

GCAGGTAATT TCGAGCGGAT CTCCGGCGAC CTGAAAACCC AGATCGACCA GGTGGAGTCG 1320 

ACGGCAGGTT CGTTGCAGGG CCAGTGGCGC GGCGCGGCGG GGACGGCCGC CCAGGCCGCG 1380 

GTGGTGCGCT TCCAAGAAGC AGCCAATAAG CAGAAGCAGG AACTCGACGA GATCTCGACG 1440 

AATATTCGTC AGGCCGGCGT CCAATACTCG AGGGCCGACG AGGAGCAGCA GCAGGCGCTG 1500 

TCCTCGCAAA TGGGCTTCTG ACCCGCTAAT ACGAAAAGAA ACGGAGCAAA AACATGACAG 1560 

AGCAGCAGTG GAATTTCGCG GGTATCGAGG CCGCGGCAAG CGCAATCCAG GGAAAT 1616 

(2) INFORMATION FOR SEQ ID NO: 108: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 432 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 108: 

CTAGTGGATG GGACCATGGC CATTTTCTGC AGTCTCACTG CCTTCTGTGT TGACATTTTG 60 

GCACGCCGGC GGAAACGAAG CACTGGGGTC GAAGAACGGC TGCGCTGCCA TATCGTCCGG 120 

AGCTTCCATA CCTTCGTGCG GCCGGAAGAG CTTGTCGTAG TCGGCCGCCA TGACAACCTC 180 

TCAGAGTGCG CTCAAACGTA TAAACACGAG AAAGGGCGAG ACCGACGGAA GGTCGAACTC 240 

GCCCGATCCC GTGTTTCGCT ATTCTACGCG AACTCGGCGT TGCCCTATGC GAACATCCCA 300 

GTGACGTTGC CTTCGGTCGA AGCCATTGCC TGACCGGCTT CGCTGATCGT CCGCGCCAGG 360 

TTCTGCAGCG CGTTGTTCAG CTCGGTAGCC GTGGCGTCCC ATTTTTGCTG GACACCCTGG 420 

TACGCCTCCG AA 432 
(2) INFORMATION FOR SEQ ID NO: 109: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 368 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 109: 

Met Leu Trp His Ala Met Pro Pro Glu Xaa Asn Thr Ala Arg Leu Met 
1 5 10 15 

Ala Gly Ala Gly Pro Ala Pro Met Leu Ala Ala Ala Ala Gly Trp Gln 
20 25 30 

Thr Leu Ser Ala Ala Leu Asp Ala Gln Ala Val Glu Leu Thr Ala Arg 
35 40 45 

Leu Asn Ser Leu Gly Glu Ala Trp Thr Gly Gly Gly Ser Asp Lys Ala 
50 55 60 

Leu Ala Ala Ala Thr Pro Met Val Val Trp Leu Gln Thr Ala Ser Thr 
65 70 75 80 

Gln Ala Lys Thr Arg Ala Met Gln Ala Thr Ala Gln Ala Ala Ala Tyr 
85 90 95 

Thr Gln Ala Met Ala Thr Thr Pro Ser Leu Pro Glu Ile Ala Ala Asn 
100 105 110 

His Ile Thr Gln Ala Val Leu Thr Ala Thr Asn Phe Phe Gly Ile Asn 
115 120 125 

Thr Ile Pro Ile Ala Leu Thr Glu Met Asp Tyr Phe Ile Arg Met Trp 
130 135 140 

Asn Gln Ala Ala Leu Ala Met Glu Val Tyr Gln Ala Glu Thr Ala Val 
145 150 155 160 

Asn Thr Leu Phe Glu Lys Leu Glu Pro Met Ala Ser Ile Leu Asp Pro 
165 170 175 

Gly Ala Ser Gln Ser Thr Thr Asn Pro Ile Phe Gly Met Pro Ser Pro 
180 185 190 

Gly Ser Ser Thr Pro Val Gly Gln Leu Pro Pro Ala Ala Thr Gln Thr 
195 200 205 

Leu Gly Gln Leu Gly Glu Met Ser Gly Pro Met Gln Gln Leu Thr Gln 
210 215 220 

Pro Leu Gln Gln Val Thr Ser Leu Phe Ser Gln Val Gly Gly Thr Gly 
225 230 235 240 

Gly Gly Asn Pro Ala Asp Glu Glu Ala Ala Gln Met Gly Leu Leu Gly 
245 250 255 

Thr Ser Pro Leu Ser Asn His Pro Leu Ala Gly Gly Ser Gly Pro Ser 
260 265 270 

Ala Gly Ala Gly Leu Leu Arg Ala Glu Ser Leu Pro Gly Ala Gly Gly 
275 280 285 

Ser Leu Thr Arg Thr Pro Leu Met Ser Gln Leu Ile Glu Lys Pro Val 
290 295 300 

Ala Pro Ser Val Met Pro Ala Ala Ala Ala Gly Ser Ser Ala Thr Gly 
305 310 315 320 

Gly Ala Ala Pro Val Gly Ala Gly Ala Met Gly Gln Gly Ala Gln Ser 
325 330 335 

Gly Gly Ser Thr Arg Pro Gly Leu Val Ala Pro Ala Pro Leu Ala Gln 
340 345 350 

Glu Arg Glu Glu Asp Asp Glu Asp Asp Trp Asp Glu Glu Asp Asp Trp 
355 360 365 

(2) INFORMATION FOR SEQ ID NO: 110: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 100 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 110:

Met Ala Glu Met Lys Thr Asp Ala Ala Thr Leu Ala Gln Glu Ala Gly
1 5 10 15

Asn Phe Glu Arg Ile Ser Gly Asp Leu Lys Thr Gln Ile Asp Gln Val
20 25 30

Glu Ser Thr Ala Gly Ser Leu Gln Gly Gln Trp Arg Gly Ala Ala Gly
35 40 45

Thr Ala Ala Gln Ala Ala Val Val Arg Phe Gln Glu Ala Ala Asn Lys
50 55 60

Gln Lys Gln Glu Leu Asp Glu Ile Ser Thr Asn Ile Arg Gln Ala Gly
65 70 75 80

Val Gln Tyr Ser Arg Ala Asp Glu Glu Gln Gln Gln Ala Leu Ser Ser
85 90 95

Gln Met Gly Phe
100

(2) INFORMATION FOR SEQ ID NO: 111:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 396 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 111: 

GATCTCCGGC GACCTGAAAA CCCAGATCGA CCAGGTGGAG TCGACGGCAG GTTCGTTGCA 60 

GGGCCAGTGG CGCGGCGCGG CGGGGACGGC CGCCCAGGCC GCGGTGGTGC GCTTCCAAGA 120

AGCAGCCAAT AAGCAGAAGC AGGAACTCGA CGAGATCTCG ACGAATATTC GTCAGGCCGG 180

CGTCCAATAC TCGAGGGCCG ACGAGGAGCA GCAGCAGGCG CTGTCCTCGC AAATGGGCTT 240

CTGACCCGCT AATACGAAAA GAAACGGAGC AAAAACATGA CAGAGCAGCA GTGGAATTTC 300

GCGGGTATCG AGGCCGCGGC AAGCGCAATC CAGGGAAATG TCACGTCCAT TCATTCCCTC 360

CTTGACGAGG GGAAGCAGTC CCTGACCAAG CTCGCA 396

(2) INFORMATION FOR SEQ ID NO: 112: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 80 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 112: 

Ile Ser Gly Asp Leu Lys Thr Gln Ile Asp Gln Val Glu Ser Thr Ala
1 5 10 15

Gly Ser Leu Gln Gly Gln Trp Arg Gly Ala Ala Gly Thr Ala Ala Gln
20 25 30

Ala Ala Val Val Arg Phe Gln Glu Ala Ala Asn Lys Gln Lys Gln Glu
35 40 45

Leu Asp Glu Ile Ser Thr Asn Ile Arg Gln Ala Gly Val Gln Tyr Ser
50 55 60

Arg Ala Asp Glu Glu Gln Gln Gln Ala Leu Ser Ser Gln Met Gly Phe
65 70 75 80

(2) INFORMATION FOR SEQ ID NO: 113: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 387 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 113:

GTGGATCCCG ATCCCGTGTT TCGCTATTCT ACGCGAACTC GGCGTTGCCC TATGCGAACA 60

TCCCAGTGAC GTTGCCTTCG GTCGAAGCCA TTGCCTGACC GGCTTCGCTG ATCGTCCGCG 120

CCAGGTTCTG CAGCGCGTTG TTCAGCTCGG TAGCCGTGGC GTCCCATTTT TGCTGGACAC 180

CCTGGTACGC CTCCGAACCG CTACCGCCCC AGGCCGCTGC GAGCTTGGTC AGGGACTGCT 240

TCCCCTCGTC AAGGAGGGAA TGAATGGACG TGACATTTCC CTGGATTGCG CTTGCCGCGG 300

CCTCGATACC CGCGAAATTC CACTGCTGCT CTGTCATGTT TTTGCTCCGT TTCTTTTCGT 360

ATTAGCGGGT CAGAAGCCCA TTTGCGA 387

(2) INFORMATION FOR SEQ ID NO: 114: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 272 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 114: 

CGGCACGAGG ATCTCGGTTG GCCCAACGGC GCTGGCGAGG GCTCCGTTCC GGGGGCGAGC 60

TGCGCGCCGG ATGCTTCCTC TGCCCGCAGC CGCGCCTGGA TGGATGGACC AGTTGCTACC 120 

TTCCCGACGT TTCGTTCGGT GTCTGTGCGA TAGCGGTGAC CCCGGCGCGC ACGTCGGGAG 180 

TGTTGGGGGG CAGGCCGGGT CGGTGGTTCG GCCGGGGACG CAGACGGTCT GGACGGAACG 240

GGCGGGGGTT CGCCGATTGG CATCTTTGCC CA 272 
(2) INFORMATION FOR SEQ ID NO: 115: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 115: 

Asp Pro Val Asp Ala Val Ile Asn Thr Thr Cys Asn Tyr Gly Gln Val
1 5 10 15

Val Ala Ala Leu 
20 

(2) INFORMATION FOR SEQ ID NO: 116: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 116:

Ala Val Glu Ser Gly Met Leu Ala Leu Gly Thr Pro Ala Pro Ser 
1 5 10 15

(2) INFORMATION FOR SEQ ID NO: 117: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 117:

Ala Ala Met Lys Pro Arg Thr Gly Asp Gly Pro Leu Glu Ala Ala Lys 
1 5 10 15

Glu Gly Arg 



(2) INFORMATION FOR SEQ ID NO: 118: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 118: 

Tyr Tyr Trp Cys Pro Gly Gln Pro Phe Asp Pro Ala Trp Gly Pro
1 5 10 15

(2) INFORMATION FOR SEQ ID NO: 119: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 119:

Asp Ile Gly Ser Glu Ser Thr Glu Asp Gln Gln Xaa Ala Val
1 5 10

(2) INFORMATION FOR SEQ ID NO: 120:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 120: 

Ala Glu Glu Ser Ile Ser Thr Xaa Glu Xaa Ile Val Pro
1 5 10

(2) INFORMATION FOR SEQ ID NO: 121: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 121:

Asp Pro Glu Pro Ala Pro Pro Val Pro Thr Thr Ala Ala Ser Pro Pro 
1 5 10 15

Ser 



(2) INFORMATION FOR SEQ ID NO: 122:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 122: 

Ala Pro Lys Thr Tyr Xaa Glu Glu Leu Lys Gly Thr Asp Thr Gly
1 5 10 15

(2) INFORMATION FOR SEQ ID NO: 123:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 123: 

Asp Pro Ala Ser Ala Pro Asp Val Pro Thr Ala Ala Gln Leu Thr Ser
1 5 10 15

Leu Leu Asn Ser Leu Ala Asp Pro Asn Val Ser Phe Ala Asn 
20 25 30 

(2) INFORMATION FOR SEQ ID NO: 124:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 124: 

Asp Pro Pro Asp Pro His Gln Xaa Asp Met Thr Lys Gly Tyr Tyr Pro
1 5 10 15

Gly Gly Arg Arg Xaa Phe 
20 



(2) INFORMATION FOR SEQ ID NO: 125:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 125:

Asp Pro Gly Tyr Thr Pro Gly 

1 5 

(2) INFORMATION FOR SEQ ID NO: 126:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(ix) FEATURE: 

(D) OTHER INFORMATION: /note= "The Second Residue Can Be Either a Pro or Thr"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 126:

Xaa Xaa Gly Phe Thr Gly Pro Gln Phe Tyr
1 5 10

(2) INFORMATION FOR SEQ ID NO: 127: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(ix) FEATURE:

(D) OTHER INFORMATION: /note= "The Third Residue Can Be Either a Gln or Leu"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 127:

Xaa Pro Xaa Val Thr Ala Tyr Ala Gly 

1 5 

(2) INFORMATION FOR SEQ ID NO: 128:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 128:

Xaa Xaa Xaa Glu Lys Pro Phe Leu Arg
1 5 

(2) INFORMATION FOR SEQ ID NO: 129:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 
(C) STRANDEDNESS:

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 129: 

Xaa Asp Ser Glu Lys Ser Ala Thr Ile Lys Val Thr Asp Ala Ser
1 5 10 15

(2) INFORMATION FOR SEQ ID NO: 130: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 130: 

Ala Gly Asp Thr Xaa Ile Tyr Ile Val Gly Asn Leu Thr Ala Asp
1 5 10 15

(2) INFORMATION FOR SEQ ID NO: 131:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 131:

Ala Pro Glu Ser Gly Ala Gly Leu Gly Gly Thr Val Gln Ala Gly
1 5 10 15

(2) INFORMATION FOR SEQ ID NO: 132: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 132:

Xaa Tyr Ile Ala Tyr Xaa Thr Thr Ala Gly Ile Val Pro Gly Lys Ile
1 5 10 15

Asn Val His Leu Val
20 

(2) INFORMATION FOR SEQ ID NO: 133:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 882 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:133: 

GCAACGCTGT CGTGGCCTTT GCGGTGATCG GTTTCGCCTC GCTGGCGGTG GCGGTGGCGG 60 

TCACCATCCG ACCGACCGCG GCCTCAAAAC CGGTAGAGGG ACACCAAAAC GCCCAGCCAG 120

GGAAGTTCAT GCCGTTGTTG CCGACGCAAC AGCAGGCGCC GGTCCCGCCG CCTCCGCCCG 180 

ATGATCCCAC CGCTGGATTC CAGGGCGGCA CCATTCCGGC TGTACAGAAC GTGGTGCCGC 240

GGCCGGGTAC CTCACCCGGG GTGGGTGGGA CGCCGGCTTC GCCTGCGCCG GAAGCGCCGG 300 

CCGTGCCCGG TGTTGTGCCT GCCCCGGTGC CAATCCCGGT CCCGATCATC ATTCCCCCGT 360 

TCCCGGGTTG GCAGCCTGGA ATGCCGACCA TCCCCACCGC ACCGCCGACG ACGCCGGTGA 420 

CCACGTCGGC GACGACGCCG CCGACCACGC CGCCGACCAC GCCGGTGACC ACGCCGCCAA 480 

CGACGCCGCC GACCACGCCG GTGACCACGC CGCCAACGAC GCCGCCGACC ACGCCGGTGA 540

CCACGCCACC AACGACCGTC GCCCCGACGA CCGTCGCCCC GACGACGGTC GCTCCGACCA 600 

CCGTCGCCCC GACGACGGTC GCTCCAGCCA CCGCCACGCC GACGACCGTC GCTCCGCAGC 660 

CGACGCAGCA GCCCACGCAA CAACCAACCC AACAGATGCC AACCCAGCAG CAGACCGTGG 720 

CCCCGCAGAC GGTGGCGCCG GCTCCGCAGC CGCCGTCCGG TGGCCGCAAC GGCAGCGGCG 780

GGGGCGACTT ATTCGGCGGG TTCTGATCAC GGTCGCGGCT TCACTACGGT CGGAGGACAT 840

GGCCGGTGAT GCGGTGACGG TGGTGCTGCC CTGTCTCAAC GA 882
(2) INFORMATION FOR SEQ ID NO: 134: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 815 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 134:

CCATCAACCA ACCGCTCGCG CCGCCCGCGC CGCCGGATCC GCCGTCGCCG CCACGCCCGC 60 

CGGTGCCTCC GGTGCCCCCG TTGCCGCCGT CGCCGCCGTC GCCGCCGACC GGCTGGGTGC 120 

CTAGGGCGCT GTTACCGCCC TGGTTGGCGG GGACGCCGCC GGCACCACCG GTACCGCCGA 180 

TGGCGCCGTT GCCGCCGGCG GCACCGTTGC CACCGTTGCC ACCGTTGCCA CCGTTGCCGA 240 

CCAGCCACCC GCCGCGACCA CCGGCACCGC CGGCGCCGCC CGCACCGCCG GCGTGCCCGT 300

TCGTGCCCGT ACCGCCGGCA CCGCCGTTGC CGCCGTCACC GCCGACGGAA CTACCGGCGG 360 

ACGCGGCCTG CCCGCCGGCG CCGCCCGCAC CGCCATTGGC ACCGCCGTCA CCGCCGGCTG 420 

GGAGTGCCGC GATTAGGGCA CTGACCGGCG CAACCAGCGC AAGTACTCTC GGTCACCGAG 480 

CACTTCCAGA CGACACCACA GCACGGGGTT GTCGGCGGAC TGGGTGAAAT GGCAGCCGAT 540

AGCGGCTAGC TGTCGGCTGC GGTCAACCTC GATCATGATG TCGAGGTGAC CGTGACCGCG 600 

CCCCCCGAAG GAGGCGCTGA ACTCGGCGTT GAGCCGATCG GCGATCGGTT GGGGCAGTGC 660 

CCAGGCCAAT ACGGGGATAC CGGGTGTCNA AGCCGCCGCG AGCGCAGCTT CGGTTGCGCG 720 

ACNGTGGTCG GGGTGGCCTG TTACGCCGTT GTCNTCGAAC ACGAGTAGCA GGTCTGCTCC 780 

GGCGAGGGCA TCCACCACGC GTTGCGTCAG CTCGT 815 
(2) INFORMATION FOR SEQ ID NO: 135:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1152 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 135: 

ACCAGCCGCC GGCTGAGGTC TCAGATCAGA GAGTCTCCGG ACTCACCGGG GCGGTTCAGC 60 

CTTCTCCCAG AACAACTGCT GAAGATCCTC GCCCGCGAAA GAGGCGCTGA TTTGACGCTC 120

TATGACCGGT TGAACGACGA GATCATCCGG CAGATTGATA TGGCACCGCT GGGCTAACAG 180 

GTGCGCAAGA TGGTGCAGCT GTATGTCTCG GACTCCGTGT CGCGGATCAG CTTTGCCGAC 240

GGCCGGGTGA TCGTGTGGAG CGAGGAGCTC GGCGAGAGCC AGTATCCGAT CGAGACGCTG 300

GACGGCATCA CGCTGTTTGG GCGGCCGACG ATGACAACGC CCTTCATCGT TGAGATGCTC 360 

AAGCGTGAGC GCGACATCCA GCTCTTCACG ACCGACGGCC ACTACCAGGG CCGGATCTCA 420

ACACCCGACG TGTCATACGC GCCGCGGCTC CGTCAGCAAG TTCACCGCAC CGACGATCCT 480

GCGTTCTGCC TGTCGTTAAG CAAGCGGATC GTGTCGAGGA AGATCCTGAA TCAGCAGGCC 540

TTGATTCGGG CACACACGTC GGGGCAAGAC GTTGCTGAGA GCATCCGCAC GATGAAGCAC 600 

TCGCTGGCCT GGGTCGATCG ATCGGGCTCC CTGGCGGAGT TGAACGGGTT CGAGGGAAAT 660 

GCCGCAAAGG CATACTTCAC CGCGCTGGGG CATCTCGTCC CGCAGGAGTT CGCATTCCAG 720 

GGCCGCTCGA CTCGGCCGCC GTTGGACGCC TTCAACTCGA TGGTCAGCCT CGGCTATTCG 780

CTGCTGTACA AGAACATCAT AGGGGCGATC GAGCGTCACA GCCTGAACGC GTATATCGGT 840

TTCCTACACC AGGATTCACG AGGGCACGCA ACGTCTCGTG CCGAATTCGG CACGAGCTCC 900 

GCTGAAACCG CTGGCCGGCT GCTCAGTGCC CGTACGTAAT CCGCTGCGCC CAGGCCGGCC 960

CGCCGGCCGA ATACCAGCAG ATCGGACAGC GAATTGCCGC CCAGCCGGTT GGAGCCGTGC 1020 

ATACCGCCGG CACACTCACC GGCAGCGAAC AGGCCTGGCA CCGTGGCGGC GCCGGTGTCC 1080

GCGTCTACTT CGACACCGCC CATCACGTAG TGACACGTCG GCCCGACTTC CATTGCCTGC 1140 

GTTCGGCACG AG 1152
(2) INFORMATION FOR SEQ ID NO: 136:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 655 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 136:

CTCCTGCCGA TTCGGCAGGG TGTACTTGCC GGTGGTGTAN GCCGCATGAG TGCCGACGAC 60

CAGCAATGCG GCAACAGCAC GGATCCCGGT CAACGACGCC ACCCGGTCCA CGTGGGCGAT 120

CCGCTCGAGT CCGCCCTGGG CGGCTCTTTC CTTGGGCAGG GTCATCCGAC GTGTTTCCGC 180

CGTGGTTTGC CGCCATTATG CCGGCGCGCC GCGTCGGGCG GCCGGTATGG CCGAANGTCG 240

ATCAGCACAC CCGAGATACG GGTCTGTGCA AGCTTTTTGA GCGTCGGGCG GGGCAGCTTC 300

GCCGGCAATT CTACTAGCGA GAAGTCTGGC CCGATACGGA TCTGACCGAA GTCGCTGCGG 360

TGCAGCCCAC CCTCATTGGC GATGGCGCCG ACGATGGCGC CTGGACCGAT CTTGTGCCGC 420

TTGCCGACGG CGACGCGGTA GGTGGTCAAG TCCGGTCTAC GCTTGGGCCT TTGCGGACGG 480

TCCCGACGCT GGTCGCGGTT GCGCCGCGAA AGCGGCGGGT CGGGTGCCAT CAGGAATGCC 540

TCACCGCCGC GGCACTGCAC GGCCAGTGCC GCGGCGATGT CAGCCATCGG GACATCATGC 600

TCGCGTTCAT ACTCCTCGAC CAGTCGGCGG AACAGCTCGA TTCCCGGACC GCCCA 655
(2) INFORMATION FOR SEQ ID NO: 137: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 267 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 137:

Asn Ala Val Val Ala Phe Ala Val Ile Gly Phe Ala Ser Leu Ala Val
1 5 10 15

Ala Val Ala Val Thr Ile Arg Pro Thr Ala Ala Ser Lys Pro Val Glu
20 25 30

Gly His Gln Asn Ala Gln Pro Gly Lys Phe Met Pro Leu Leu Pro Thr
35 40 45

Gln Gln Gln Ala Pro Val Pro Pro Pro Pro Pro Asp Asp Pro Thr Ala
50 55 60

Gly Phe Gln Gly Gly Thr Ile Pro Ala Val Gln Asn Val Val Pro Arg
65 70 75 80

Pro Gly Thr Ser Pro Gly Val Gly Gly Thr Pro Ala Ser Pro Ala Pro
85 90 95

Glu Ala Pro Ala Val Pro Gly Val Val Pro Ala Pro Val Pro Ile Pro
100 105 110

Val Pro Ile Ile Ile Pro Pro Phe Pro Gly Trp Gln Pro Gly Met Pro
115 120 125

Thr Ile Pro Thr Ala Pro Pro Thr Thr Pro Val Thr Thr Ser Ala Thr
130 135 140

Thr Pro Pro Thr Thr Pro Pro Thr Thr Pro Val Thr Thr Pro Pro Thr
145 150 155 160

Thr Pro Pro Thr Thr Pro Val Thr Thr Pro Pro Thr Thr Pro Pro Thr
165 170 175

Thr Pro Val Thr Thr Pro Pro Thr Thr Val Ala Pro Thr Thr Val Ala
180 185 190

Pro Thr Thr Val Ala Pro Thr Thr Val Ala Pro Thr Thr Val Ala Pro
195 200 205

Ala Thr Ala Thr Pro Thr Thr Val Ala Pro Gln Pro Thr Gln Gln Pro
210 215 220

Thr Gln Gln Pro Thr Gln Gln Met Pro Thr Gln Gln Gln Thr Val Ala
225 230 235 240

Pro Gln Thr Val Ala Pro Ala Pro Gln Pro Pro Ser Gly Gly Arg Asn
245 250 255

Gly Ser Gly Gly Gly Asp Leu Phe Gly Gly Phe
260 265

(2) INFORMATION FOR SEQ ID NO: 138:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 174 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 138: 

Ile Asn Gln Pro Leu Ala Pro Pro Ala Pro Pro Asp Pro Pro Ser Pro
1 5 10 15

Pro Arg Pro Pro Val Pro Pro Val Pro Pro Leu Pro Pro Ser Pro Pro 
20 25 30 

Ser Pro Pro Thr Gly Trp Val Pro Arg Ala Leu Leu Pro Pro Trp Leu 
35 40 45 

Ala Gly Thr Pro Pro Ala Pro Pro Val Pro Pro Met Ala Pro Leu Pro 
50 55 60 

Pro Ala Ala Pro Leu Pro Pro Leu Pro Pro Leu Pro Pro Leu Pro Thr 
65 70 75 80

Ser His Pro Pro Arg Pro Pro Ala Pro Pro Ala Pro Pro Ala Pro Pro 
85 90 95 

Ala Cys Pro Phe Val Pro Val Pro Pro Ala Pro Pro Leu Pro Pro Ser 
100 105 110 

Pro Pro Thr Glu Leu Pro Ala Asp Ala Ala Cys Pro Pro Ala Pro Pro 
115 120 125 

Ala Pro Pro Leu Ala Pro Pro Ser Pro Pro Ala Gly Ser Ala Ala Ile
130 135 140 

Arg Ala Leu Thr Gly Ala Thr Ser Ala Ser Thr Leu Gly His Arg Ala 
145 150 155 160 
Leu Pro Asp Asp Thr Thr Ala Arg Gly Cys Arg Arg Thr Gly
165 170 

(2) INFORMATION FOR SEQ ID NO: 139:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 35 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 139: 

Gln Pro Pro Ala Glu Val Ser Asp Gln Arg Val Ser Gly Leu Thr Gly
1 5 10 15

Ala Val Gln Pro Ser Pro Arg Thr Thr Ala Glu Asp Pro Arg Pro Arg
20 25 30 

Asn Arg Arg 
35 

(2) INFORMATION FOR SEQ ID NO: 140:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 104 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 140:

Arg Ala Asp Ser Ala Gly Cys Thr Cys Arg Trp Cys Xaa Pro His Glu
1 5 10 15

Cys Arg Arg Pro Ala Met Arg Gln Gln His Gly Ser Arg Ser Thr Thr
20 25 30

Pro Pro Gly Pro Arg Gly Arg Ser Ala Arg Val Arg Pro Gly Arg Leu
35 40 45

Phe Pro Trp Ala Gly Ser Ser Asp Val Phe Pro Pro Trp Phe Ala Ala
50 55 60

Ile Met Pro Ala Arg Arg Val Gly Arg Pro Val Trp Pro Xaa Val Asp
65 70 75 80

Gln His Thr Arg Asp Thr Gly Leu Cys Lys Leu Phe Glu Arg Arg Ala
85 90 95

Gly Gln Leu Arg Arg Gln Phe Tyr
100

(2) INFORMATION FOR SEQ ID NO: 141: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 53 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Other nucleic acid 

(A) DESCRIPTION: /desc = "PCR primer" 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 141:

GGATCCATAT GGGCCATCAT CATCATCATC ACGTGATCGA CATCATCGGG ACC 53 

(2) INFORMATION FOR SEQ ID NO:142: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 42 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "PCR Primer"

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 142:

CCTGAATTCA GGCCTCGGTT GCGCCGGCCT CATCTTGAAC GA 42 

(2) INFORMATION FOR SEQ ID NO: 143:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid

(A) DESCRIPTION: /desc = "PCR Primer" 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 143: 
GGATCCTGCA GGCTCGAAAC CACCGAGCGG T 31
(2) INFORMATION FOR SEQ ID NO: 144: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "PCR primer"

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Mycobacterium tuberculosis

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 144:

CTCTGAATTC AGCGCTGGAA ATCGTCGCGA T 31 

(2) INFORMATION FOR SEQ ID NO:145: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Other nucleic acid 

(A) DESCRIPTION: /desc = "PCR primer"

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 145: 

GGATCCAGCG CTGAGATGAA GACCGATGCC GCT 33

(2) INFORMATION FOR SEQ ID NO: 146:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "PCR primer"

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Mycobacterium tuberculosis 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 146: 

GAGAGAATTC TCAGAAGCCC ATTTGCGAGG ACA 33

(2) INFORMATION FOR SEQ ID NO: 147:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1993 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 152..1273

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 147: 

TGTTCTTCGA CGGCAGGCTG GTGGAGGAAG GGCCCACCGA ACAGCTGTTC TCCTCGCCGA 60 

AGCATGCGGA AACCGCCCGA TACGTCGCCG GACTGTCGGG GGACGTCAAG GACGCCAAGC 120 

GCGGAAATTG AAGAGCACAG AAAGGTATGG C GTG AAA ATT CGT TTG CAT ACG 172 

Val Lys Ile Arg Leu His Thr
1 5

CTG TTG GCC GTG TTG ACC GCT GCG CCG CTG CTG CTA GCA GCG GCG GGC 220
Leu Leu Ala Val Leu Thr Ala Ala Pro Leu Leu Leu Ala Ala Ala Gly
10 15 20

TGT GGC TCG AAA CCA CCG AGC GGT TCG CCT GAA ACG GGC GCC GGC GCC 268
Cys Gly Ser Lys Pro Pro Ser Gly Ser Pro Glu Thr Gly Ala Gly Ala
25 30 35

GGT ACT GTC GCG ACT ACC CCC GCG TCG TCG CCG GTG ACG TTG GCG GAG 316
Gly Thr Val Ala Thr Thr Pro Ala Ser Ser Pro Val Thr Leu Ala Glu
40 45 50 55

ACC GGT AGC ACG CTG CTC TAC CCG CTG TTC AAC CTG TGG GGT CCG GCC 364
Thr Gly Ser Thr Leu Leu Tyr Pro Leu Phe Asn Leu Trp Gly Pro Ala
60 65 70

TTT CAC GAG AGG TAT CCG AAC GTC ACG ATC ACC GCT CAG GGC ACC GGT 412
Phe His Glu Arg Tyr Pro Asn Val Thr Ile Thr Ala Gln Gly Thr Gly
75 80 85

TCT GGT GCC GGG ATC GCG CAG GCC GCC GCC GGG ACG GTC AAC ATT GGG 460
Ser Gly Ala Gly Ile Ala Gln Ala Ala Ala Gly Thr Val Asn Ile Gly
90 95 100

GCC TCC GAC GCC TAT CTG TCG GAA GGT GAT ATG GCC GCG CAC AAG GGG 508
Ala Ser Asp Ala Tyr Leu Ser Glu Gly Asp Met Ala Ala His Lys Gly
105 110 115

CTG ATG AAC ATC GCG CTA GCC ATC TCC GCT CAG CAG GTC AAC TAC AAC 556
Leu Met Asn Ile Ala Leu Ala Ile Ser Ala Gln Gln Val Asn Tyr Asn
120 125 130 135

CTG CCC GGA GTG AGC GAG CAC CTC AAG CTG AAC GGA AAA GTC CTG GCG 604
Leu Pro Gly Val Ser Glu His Leu Lys Leu Asn Gly Lys Val Leu Ala
140 145 150

GCC ATG TAC CAG GGC ACC ATC AAA ACC TGG GAC GAC CCG CAG ATC GCT 652
Ala Met Tyr Gln Gly Thr Ile Lys Thr Trp Asp Asp Pro Gln Ile Ala
155 160 165

GCG CTC AAC CCC GGC GTG AAC CTG CCC GGC ACC GCG GTA GTT CCG CTG 700
Ala Leu Asn Pro Gly Val Asn Leu Pro Gly Thr Ala Val Val Pro Leu
170 175 180

CAC CGC TCC GAC GGG TCC GGT GAC ACC TTC TTG TTC ACC CAG TAC CTG 748
His Arg Ser Asp Gly Ser Gly Asp Thr Phe Leu Phe Thr Gln Tyr Leu
185 190 195

TCC AAG CAA GAT CCC GAG GGC TGG GGC AAG TCG CCC GGC TTC GGC ACC 796
Ser Lys Gln Asp Pro Glu Gly Trp Gly Lys Ser Pro Gly Phe Gly Thr
200 205 210 215

ACC GTC GAC TTC CCG GCG GTG CCG GGT GCG CTG GGT GAG AAC GGC AAC 844
Thr Val Asp Phe Pro Ala Val Pro Gly Ala Leu Gly Glu Asn Gly Asn
220 225 230

GGC GGC ATG GTG ACC GGT TGC GCC GAG ACA CCG GGC TGC GTG GCC TAT 892
Gly Gly Met Val Thr Gly Cys Ala Glu Thr Pro Gly Cys Val Ala Tyr
235 240 245

ATC GGC ATC AGC TTC CTC GAC CAG GCC AGT CAA CGG GGA CTC GGC GAG 940
Ile Gly Ile Ser Phe Leu Asp Gln Ala Ser Gln Arg Gly Leu Gly Glu
250 255 260

GCC CAA CTA GGC AAT AGC TCT GGC AAT TTC TTG TTG CCC GAC GCG CAA 988
Ala Gln Leu Gly Asn Ser Ser Gly Asn Phe Leu Leu Pro Asp Ala Gln
265 270 275

AGC ATT CAG GCC GCG GCG GCT GGC TTC GCA TCG AAA ACC CCG GCG AAC 1036
Ser Ile Gln Ala Ala Ala Ala Gly Phe Ala Ser Lys Thr Pro Ala Asn
280 285 290 295

CAG GCG ATT TCG ATG ATC GAC GGG CCC GCC CCG GAC GGC TAC CCG ATC 1084
Gln Ala Ile Ser Met Ile Asp Gly Pro Ala Pro Asp Gly Tyr Pro Ile
300 305 310

ATC AAC TAC GAG TAC GCC ATC GTC AAC AAC CGG CAA AAG GAC GCC GCC 1132
Ile Asn Tyr Glu Tyr Ala Ile Val Asn Asn Arg Gln Lys Asp Ala Ala
315 320 325

ACC GCG CAG ACC TTG CAG GCA TTT CTG CAC TGG GCG ATC ACC GAC GGC 1180
Thr Ala Gln Thr Leu Gln Ala Phe Leu His Trp Ala Ile Thr Asp Gly
330 335 340

AAC AAG GCC TCG TTC CTC GAC CAG GTT CAT TTC CAG CCG CTG CCG CCC 1228
Asn Lys Ala Ser Phe Leu Asp Gln Val His Phe Gln Pro Leu Pro Pro
345 350 355

GCG GTG GTG AAG TTG TCT GAC GCG TTG ATC GCG ACG ATT TCC AGC 1273
Ala Val Val Lys Leu Ser Asp Ala Leu Ile Ala Thr Ile Ser Ser
360 365 370

TAGCCTCGTT GACCACCACG CGACAGCAAC CTCCGTCGGG CCATCGGGCT GCTTTGCGGA 1333 

GCATGCTGGC CCGTGCCGGT GAAGTCGGCC GCGCTGGCCC GGCCATCCGG TGGTTGGGTG 1393

GGATAGGTGC GGTGATCCCG CTGCTTGCGC TGGTCTTGGT GCTGGTGGTG CTGGTCATCG 1453 

AGGCGATGGG TGCGATCAGG CTCAACGGGT TGCATTTCTT CACCGCCACC GAATGGAATC 1513 

CAGGCAACAC CTACGGCGAA ACCGTTGTCA CCGACGCGTC GCCCATCCGG TCGGCGCCTA 1573 

CTACGGGGCG TTGCCGCTGA TCGTCGGGAC GCTGGCGACC TCGGCAATCG CCCTGATCAT 1633

CGCGGTGCCG GTCTCTGTAG GAGCGGCGCT GGTGATCGTG GAACGGCTGC CGAAACGGTT 1693 

GGCCGAGGCT GTGGGAATAG TCCTGGAATT GCTCGCCGGA ATCCCCAGCG TGGTCGTCGG 1753 

TTTGTGGGGG GCAATGACGT TCGGGCCGTT CATCGCTCAT CACATCGCTC CGGTGATCGC 1813 

TCACAACGCT CCCGATGTGC CGGTGCTGAA CTACTTGCGC GGCGACCCGG GCAACGGGGA 1873 

GGGCATGTTG GTGTCCGGTC TGGTGTTGGC GGTGATGGTC GTTCCCATTA TCGCCACCAC 1933

CACTCATGAC CTGTTCCGGC AGGTGCCGGT GTTGCCCCGG GAGGGCGCGA TCGGGAATTC 1993 
(2) INFORMATION FOR SEQ ID NO: 148:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 374 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 148:

Val Lys Ile Arg Leu His Thr Leu Leu Ala Val Leu Thr Ala Ala Pro
1 5 10 15

Leu Leu Leu Ala Ala Ala Gly Cys Gly Ser Lys Pro Pro Ser Gly Ser
20 25 30

Pro Glu Thr Gly Ala Gly Ala Gly Thr Val Ala Thr Thr Pro Ala Ser
35 40 45

Ser Pro Val Thr Leu Ala Glu Thr Gly Ser Thr Leu Leu Tyr Pro Leu
50 55 60

Phe Asn Leu Trp Gly Pro Ala Phe His Glu Arg Tyr Pro Asn Val Thr
65 70 75 80

Ile Thr Ala Gln Gly Thr Gly Ser Gly Ala Gly Ile Ala Gln Ala Ala
85 90 95

Ala Gly Thr Val Asn Ile Gly Ala Ser Asp Ala Tyr Leu Ser Glu Gly
100 105 110

Asp Met Ala Ala His Lys Gly Leu Met Asn Ile Ala Leu Ala Ile Ser
115 120 125

Ala Gln Gln Val Asn Tyr Asn Leu Pro Gly Val Ser Glu His Leu Lys
130 135 140

Leu Asn Gly Lys Val Leu Ala Ala Met Tyr Gln Gly Thr Ile Lys Thr
145 150 155 160

Trp Asp Asp Pro Gln Ile Ala Ala Leu Asn Pro Gly Val Asn Leu Pro
165 170 175

Gly Thr Ala Val Val Pro Leu His Arg Ser Asp Gly Ser Gly Asp Thr
180 185 190

Phe Leu Phe Thr Gln Tyr Leu Ser Lys Gln Asp Pro Glu Gly Trp Gly
195 200 205

Lys Ser Pro Gly Phe Gly Thr Thr Val Asp Phe Pro Ala Val Pro Gly
210 215 220

Ala Leu Gly Glu Asn Gly Asn Gly Gly Met Val Thr Gly Cys Ala Glu
225 230 235 240

Thr Pro Gly Cys Val Ala Tyr Ile Gly Ile Ser Phe Leu Asp Gln Ala
245 250 255

Ser Gln Arg Gly Leu Gly Glu Ala Gln Leu Gly Asn Ser Ser Gly Asn
260 265 270

Phe Leu Leu Pro Asp Ala Gln Ser Ile Gln Ala Ala Ala Ala Gly Phe
275 280 285

Ala Ser Lys Thr Pro Ala Asn Gln Ala Ile Ser Met Ile Asp Gly Pro
290 295 300

Ala Pro Asp Gly Tyr Pro Ile Ile Asn Tyr Glu Tyr Ala Ile Val Asn
305 310 315 320

Asn Arg Gln Lys Asp Ala Ala Thr Ala Gln Thr Leu Gln Ala Phe Leu
325 330 335

His Trp Ala Ile Thr Asp Gly Asn Lys Ala Ser Phe Leu Asp Gln Val
340 345 350

His Phe Gln Pro Leu Pro Pro Ala Val Val Lys Leu Ser Asp Ala Leu
355 360 365

Ile Ala Thr Ile Ser Ser
370

(2) INFORMATION FOR SEQ ID NO: 149: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1993 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 149: 

TGTTCTTCGA CGGCAGGCTG GTGGAGGAAG GGCCCACCGA ACAGCTGTTC TCCTCGCCGA 60 

AGCATGCGGA AACCGCCCGA TACGTCGCCG GACTGTCGGG GGACGTCAAG GACGCCAAGC 120

GCGGAAATTG AAGAGCACAG AAAGGTATGG CGTGAAAATT CGTTTGCATA CGCTGTTGGC 180

CGTGTTGACC GCTGCGCCGC TGCTGCTAGC AGCGGCGGGC TGTGGCTCGA AACCACCGAG 240

CGGTTCGCCT GAAACGGGCG CCGGCGCCGG TACTGTCGCG ACTACCCCCG CGTCGTCGCC 300

GGTGACGTTG GCGGAGACCG GTAGCACGCT GCTCTACCCG CTGTTCAACC TGTGGGGTCC 360

GGCCTTTCAC GAGAGGTATC CGAACGTCAC GATCACCGCT CAGGGCACCG GTTCTGGTGC 420

CGGGATCGCG CAGGCCGCCG CCGGGACGGT CAACATTGGG GCCTCCGACG CCTATCTGTC 480

GGAAGGTGAT ATGGCCGCGC ACAAGGGGCT GATGAACATC GCGCTAGCCA TCTCCGCTCA 540

GCAGGTCAAC TACAACCTGC CCGGAGTGAG CGAGCACCTC AAGCTGAACG GAAAAGTCCT 600

GGCGGCCATG TACCAGGGCA CCATCAAAAC CTGGGACGAC CCGCAGATCG CTGCGCTCAA 660

CCCCGGCGTG AACCTGCCCG GCACCGCGGT AGTTCCGCTG CACCGCTCCG ACGGGTCCGG 720

TGACACCTTC TTGTTCACCC AGTACCTGTC CAAGCAAGAT CCCGAGGGCT GGGGCAAGTC 780

GCCCGGCTTC GGCACCACCG TCGACTTCCC GGCGGTGCCG GGTGCGCTGG GTGAGAACGG 840

CAACGGCGGC ATGGTGACCG GTTGCGCCGA GACACCGGGC TGCGTGGCCT ATATCGGCAT 900

CAGCTTCCTC GACCAGGCCA GTCAACGGGG ACTCGGCGAG GCCCAACTAG GCAATAGCTC 960

TGGCAATTTC TTGTTGCCCG ACGCGCAAAG CATTCAGGCC GCGGCGGCTG GCTTCGCATC 1020

GAAAACCCCG GCGAACCAGG CGATTTCGAT GATCGACGGG CCCGCCCCGG ACGGCTACCC 1080

GATCATCAAC TACGAGTACG CCATCGTCAA CAACCGGCAA AAGGACGCCG CCACCGCGCA 1140



WO 99/42118



100 



PCT/US99/03265 



GACCTTGCAG GCATTTCTGC ACTGGGCGAT CACCGACGGC AACAAGGCCT CGTTCCTCGA 1200 

CCAGGTTCAT TTCCAGCCGC TGCCGCCCGC GGTGGTGAAG TTGTCTGACG CGTTGATCGC 1260

GACGATTTCC AGCTAGCCTC GTTGACCACC ACGCGACAGC AACCTCCGTC GGGCCATCGG 1320

GCTGCTTTGC GGAGCATGCT GGCCCGTGCC GGTGAAGTCG GCCGCGCTGG CCCGGCCATC 1380

CGGTGGTTGG GTGGGATAGG TGCGGTGATC CCGCTGCTTG CGCTGGTCTT GGTGCTGGTG 1440 

GTGCTGGTCA TCGAGGCGAT GGGTGCGATC AGGCTCAACG GGTTGCATTT CTTCACCGCC 1500 

ACCGAATGGA ATCCAGGCAA CACCTACGGC GAAACCGTTG TCACCGACGC GTCGCCCATC 1560 

CGGTCGGCGC CTACTACGGG GCGTTGCCGC TGATCGTCGG GACGCTGGCG ACCTCGGCAA 1620

TCGCCCTGAT CATCGCGGTG CCGGTCTCTG TAGGAGCGGC GCTGGTGATC GTGGAACGGC 1680 

TGCCGAAACG GTTGGCCGAG GCTGTGGGAA TAGTCCTGGA ATTGCTCGCC GGAATCCCCA 1740

GCGTGGTCGT CGGTTTGTGG GGGGCAATGA CGTTCGGGCC GTTCATCGCT CATCACATCG 1800

CTCCGGTGAT CGCTCACAAC GCTCCCGATG TGCCGGTGCT GAACTACTTG CGCGGCGACC 1860 

CGGGCAACGG GGAGGGCATG TTGGTGTCCG GTCTGGTGTT GGCGGTGATG GTCGTTCCCA 1920 

TTATCGCCAC CACCACTCAT GACCTGTTCC GGCAGGTGCC GGTGTTGCCC CGGGAGGGCG 1980 
CGATCGGGAA TTC 1993

(2) INFORMATION FOR SEQ ID NO: 150:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 374 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(Xl) SEQUENCE DESCRIPTION: SEQ ID NO: 150: 

Met Lys Ile Arg Leu His Thr Leu Leu Ala Val Leu Thr Ala Ala Pro
1 5 10 15

Leu Leu Leu Ala Ala Ala Gly Cys Gly Ser Lys Pro Pro Ser Gly Ser
20 25 30

Pro Glu Thr Gly Ala Gly Ala Gly Thr Val Ala Thr Thr Pro Ala Ser
35 40 45

Ser Pro Val Thr Leu Ala Glu Thr Gly Ser Thr Leu Leu Tyr Pro Leu
50 55 60

Phe Asn Leu Trp Gly Pro Ala Phe His Glu Arg Tyr Pro Asn Val Thr
65 70 75 80

Ile Thr Ala Gln Gly Thr Gly Ser Gly Ala Gly Ile Ala Gln Ala Ala
85 90 95

Ala Gly Thr Val Asn Ile Gly Ala Ser Asp Ala Tyr Leu Ser Glu Gly
100 105 110

Asp Met Ala Ala His Lys Gly Leu Met Asn Ile Ala Leu Ala Ile Ser
115 120 125

Ala Gln Gln Val Asn Tyr Asn Leu Pro Gly Val Ser Glu His Leu Lys
130 135 140

Leu Asn Gly Lys Val Leu Ala Ala Met Tyr Gln Gly Thr Ile Lys Thr
145 150 155 160

Trp Asp Asp Pro Gln Ile Ala Ala Leu Asn Pro Gly Val Asn Leu Pro
165 170 175

Gly Thr Ala Val Val Pro Leu His Arg Ser Asp Gly Ser Gly Asp Thr
180 185 190

Phe Leu Phe Thr Gln Tyr Leu Ser Lys Gln Asp Pro Glu Gly Trp Gly
195 200 205

Lys Ser Pro Gly Phe Gly Thr Thr Val Asp Phe Pro Ala Val Pro Gly
210 215 220

Ala Leu Gly Glu Asn Gly Asn Gly Gly Met Val Thr Gly Cys Ala Glu
225 230 235 240

Thr Pro Gly Cys Val Ala Tyr Ile Gly Ile Ser Phe Leu Asp Gln Ala
245 250 255

Ser Gln Arg Gly Leu Gly Glu Ala Gln Leu Gly Asn Ser Ser Gly Asn
260 265 270

Phe Leu Leu Pro Asp Ala Gln Ser Ile Gln Ala Ala Ala Ala Gly Phe
275 280 285

Ala Ser Lys Thr Pro Ala Asn Gln Ala Ile Ser Met Ile Asp Gly Pro
290 295 300

Ala Pro Asp Gly Tyr Pro Ile Ile Asn Tyr Glu Tyr Ala Ile Val Asn
305 310 315 320

Asn Arg Gln Lys Asp Ala Ala Thr Ala Gln Thr Leu Gln Ala Phe Leu
325 330 335

His Trp Ala Ile Thr Asp Gly Asn Lys Ala Ser Phe Leu Asp Gln Val
340 345 350

His Phe Gln Pro Leu Pro Pro Ala Val Val Lys Leu Ser Asp Ala Leu
355 360 365

Ile Ala Thr Ile Ser Ser
370

(2) INFORMATION FOR SEQ ID NO: 151: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1777 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 151: 

GGTCTTGACC ACCACCTGGG TGTCGAAGTC GGTGCCCGGA TTGAAGTCCA GGTACTCGTG 60 

GGTGGGGCGG GCGAAACAAT AGCGACAAGC ATGCGAGCAG CCGCGGTAGC CGTTGACGGT 120 

GTAGCGAAAC GGCAACGCGG CCGCGTTGGG CACCTTGTTC AGCGCTGATT TGCACAACAC 180 

CTCGTGGAAG GTGATGCCGT CGAATTGTGG CGCGCGAACG CTGCGGACCA GGCCGATCCG 240

CTGCAACCCG GCAGCGCCCG TCGTCAACGG GCATCCCGTT CACCGCGACG GCTTGCCGGG 300 

CCCAACGCAT ACCATTATTC GAACAACCGT TCTATACTTT GTCAACGCTG GCCGCTACCG 360 

AGCGCCGCAC AGGATGTGAT ATGCCATCTC TGCCCGCACA GACAGGAGCC AGGCCTTATG 420

ACAGCATTCG GCGTCGAGCC CTACGGGCAG CCGAAGTACC TAGAAATCGC CGGGAAGCGC 480

ATGGCGTATA TCGACGAAGG CAAGGGTGAC GCCATCGTCT TTCAGCACGG CAACCCCACG 540 

TCGTCTTACT TGTGGCGCAA CATCATGCCG CACTTGGAAG GGCTGGGCCG GCTGGTGGCC 600

TGCGATCTGA TCGGGATGGG CGC'JTCGGAC AAGCTCAGCC CATCGGGACC CGACCGCTAT 660

AGCTATGGCG AGCAACGAGA CTTTTTGTTC GCGCTCTGGG ATGCGCTCGA CCTCGGCGAC 720

CACGTGGTAC TGGTGCTGCA CGACTGGGGC TCGGCGCTCG GCTTCGACTG GGCTAACCAG 780

CATCGGGACC GAGTGCAGGG GATCGCGTTC ATGGAAGCGA TCGTCACCCC GATGACGTGG 840

GCGGACTGGC CGCCGGCCGT GCGGGGTGTG TTCCAGGGTT TCCGATCGCC TCAAGGCGAG 900 

CCAATGGCGT TGGAGCACAA CATCTTTGTC GAACGGGTGC TGCCCGGGGC GATCCTGCGA 960

CAGCTCAGCG ACGAGGAAAT GAACCACTAT CGGCGGCCAT TCGTGAACGG CGGCGAGGAC 1020

CGTCGCCCCA CGTTGTCGTG GCCACGAAAC CTTCCAATCG ACGGTGAGCC CGCCGAGGTC 1080 

GTCGCGTTGG TCAACGAGTA CCGGAGCTGG CTCGAGGAAA CCGACATGCC GAAACTGTTC 1140

ATCAACGCCG AGCCCGGCGC GATCATCACC GGCCGCATCC GTGACTATGT CAGGAGCTGG 1200

CCCAACCAGA CCGAAATCAC AGTGCCCGGC GTGCATTTCG TTCAGGAGGA CAGCGATGGC 1260 

GTCGTATCGT GGGCGGGCGC TCGGCAGCAT CGGCGACCTG GGAGCGCTCT CATTTCACGA 1320

GACCAAGAAT GTGATTTCCG GCGAAGGCGG CGCCCTGCTT GTCAACTCAT AAGACTTCCT 1380

GCTCCGGGCA GAGATTCTCA GGGAAAAGGG CACCAATCGC AGCCGCTTCC TTCGCAACGA 1440 

GGTCGACAAA TATACGTGGC AGGACAAAGG TCTTCCTATT TGCCCAGCGA ATTAGTCGCT 1500

GCCTTTCTAT GGGCTCAGTT CGAGGAAGCC GAGCGGATCA CGCGTATCCG ATTGGACCTA 1560 

TGGAACCGGT ATCATGAAAG CTTCGAATCA TTGGAACAGC GGGGGCTCCT GCGCCGTCCG 1620 

ATCATCCCAC AGGGCTGCTC TCACAACGCC CACATGTACT ACGTGTTACT AGCGCCCAGC 1680 

GCCGATCGGG AGGAGGTGCT GGCGCGTCTG ACGAGCGAAG GTATAGGCGC GGTCTTTCAT 1740

TACGTGCCGC TTCACGATTC GCCGGCCGGG CGTCGCT 1777 
(2) INFORMATION FOR SEQ ID NO: 152: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 324 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 152: 

GAGATTGAAT CGTACCGGTC TCCTTAGCGG CTCCGTCCCG TGAATGCCCA TATCACGCAC 60 

GGCCATGTTC TGGCTGTCGA CCTTCGCCCC ATGCCCGGAC GTTGGTAAAC CCAGGGTTTG 120

ATCAGTAATT CCGGGGGACG GTTGCGGGAA GGCGGCCAGG ATGTGCGTGA GCCGCGGCGC 180

CGCCGTCGCC CAGGCGACCG CTGGATGCTC AGCCCCGGTG CGGCGACGTA GCCAGCGTTT 240

GGCGCGTGTC GTCCACAGTG GTACTCCGGT GACGACGCGG CGCGGTGCCT GGGTGAAGAC 300

CGTGACCGAC GCCGCCGATT CAGA 324 
(2) INFORMATION FOR SEQ ID NO: 153:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1338 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 153: 

GCGGTACCGC CGCGTTGCGC TGGCACGGGA CCTGTACGAC CTGAACCACT TCGCCTCGCG 60 

AACGATTGAC GAACCGCTCG TGCGGCGGCT GTGGGTGCTC AAGGTGTGGG GTGATGTCGT 120

CGATGACCGG CGCGGCACCC GGCCACTACG CGTCGAAGAC GTCCTCGCCG CCCGCAGCGA 180



GCACGACTTC CAGCCCGACT CGATCGGCGT GCTGACCCGT CCTGTCGCTA TGGCTGCCTG 240

GGAAGCTCGC GTTCGGAAGC GATTTGCGTT CCTCACTGAC CTCOACGCCG ACGAGCAGCG 300 

GTGGGCCGCC TGCGACGAAC GGCACCGCCG CGAAGTGGAG AACGCGCTGG CGGTGCTGCG 360 

GTCCTGATCA ACCTGCCGGC GATCGTGCCG TTCCGCTGGC ACGGTTGCGG CTGGACGCGG 420 

CTGAATCGAC TAGATGAGAG CAGTTGGGCA CGAATCCGGC TGTGGTGGTG AGCAAGACAC 480 

GAGTACTGTC ATCACTATTG GATGCACTGG ATGACCGGCC TGATTCAGCA GGACCAATGG 540

AACTGCCCGG GGCAAAACGT CTCGGAGATG ATCGGCGTCC CCTCGGAACC CTGCGGTGCT 600 

GGCGTCATTC GGACATCGGT CCGGCTCGCG GGATCGTGGT GACGCCAGCG CTGAAGGAGT 660 

GGAGCGCGGC GGTGCACGCG CTGCTGGACG GCCGGCAGAC GGTGCTGCTG CGTAAGGGCG 720

GGATCGGCGA GAAGCGCTTC GAGGTGGCGG CCCACGAGTT CTTGTTGTTC CCGACGGTCG 780 

CGCACAGCCA CGCCGAGCGG GTTCGCCCCG AGCACCGCGA CCTGCTGGGC CCGGCGGCCG 840

CCGACAGCAC CGACGAGTGT GTGCTACTGC GGGCCGCAGC GAAAGTTGTT GCCGCACTGC 900 

CGGTTAACCG GCCAGAGGGT CTGGACGCCA TCGAGGATCT GCACATCTGG ACCGCCGAGT 960 

CGGTGCGCGC CGACCGGCTC GACTTTCGGC CCAAGCACAA ACTGGCCGTC TTGGTGGTCT 1020 

CGGCGATCCC GCTGGCCGAG CCGGTCCGGC TGGCGCGTAG GCCCGAGTAC GGCGGTTGCA 1080 

CdAGCTGGGT GCAGCTGCCG GTGACGCCGA CGTTGGCGGC GCCGGTGCAC GACGAGGCCG 1140

CGCrGGCCGA GGTCGCCGCC CGGGTCCGCG AGGCCGTGGG TTGACTGGGC GGCATCGCTT 1200

GGGTCTGAGC TGTACGCCCA GTCGGCGCTG CGAGTGATCT GCTGTCGGTT CGGTCCCTGC 1260

TGGCGTCAAT TGACGGCGCG GGCAACAGCA GCATTGGCGG CGCCATCCTC CGCGCGGCCG 1320

GCGCCCACCG CTACAACC 1338
(2) INFORMATION FOR SEQ ID NO: 154: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 321 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 154: 

CCGGCGGCAC CGGCGGCACC GGCGGTACCG GCGGCAACGG CGCTGACGCC GCTGCTGTGG 60 

TGGGCTTCGG CGCGAACGGC GACCCTGGCT TCGCTGGCGG CAAAGGCGGT AACGGCGGAA 120

TAGGTGGGGC CGCGGTGACA GGCGGGGTCG CCGGCGACGG CGGCACCGGC GGCAAAGGTG 180

GCACCGGCGG TGCCGGCGGC GCCGGCAACG ACGCCGGCAG CACCGGCAAT CCCGGCGGTA 240

AGGGCGGCGA CGGCGGGATC GGCGGTGCCG GCGGGGCCGG CGGCGCGGCC GGCACCGGCA 300 

ACGGCGGCCA TGCCGGCAAC C 321 
(2) INFORMATION FOR SEQ ID NO: 155: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 492 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 155:

GAAGACCCGG CCCCGCCATA TCGATCGGCT CGCCGACTAC TTTCGCCGAA CGTGCACGCG 60

GCGGCGTCGG GCTGATCATC ACCGGTGGCT ACGCGCCCAA CCGCACCGGA TGGCTGCTGC 120 

CGTTCGCCTC CGAACTCGTC ACTTCGGCGC AAGCCCGACG GCACCGCCGA ATCACCAGGG 180 

CGGTCCACGA TTCGGGTGCA AAGATCCTGC TGCAAATCCT GCACGCCGGA CGCTACGCCT 240

ACCACCCACT TGCGGTCAGC GCCTCGCCGA TCAAGGCGCC GATCACCCCG TTTCGTCCGC 300 

GAGCACTATC GGCTCGCGGG GTCGAAGCGA CCATCGCGGA TTTCGCCCGC TGCGCGCAGT 360 

TGGCCCGCGA TGCCGGCTAC GACGGCGTCG AAATCATGGG CAGCGAAGGG TATCTGCTCA 420 

ATCAGTTCCT GGCGCCGCGC ACCAACAAGC GCACCGACTC GTGGGGCGGC ACACCGGCCA 480 

kCZ3TC0CC^ GT 492
(2) INFORMATION FOR SEQ ID NO: 156:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 536 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 156:

Phe Ala Gln His Leu Val Glu Gly Asp Ala Val Glu Leu Trp Arg Ala
1 5 10 15

Asn Ala Ala Asp Gln Ala Asp Pro Leu Gln Pro Gly Ser Ala Arg Arg
20 25 30

Gln Arg Ala Ser Arg Ser Pro Arg Arg Leu Ala Gly Pro Asn Ala Tyr
35 40 45

His Tyr Ser Asn Asn Arg Ser Ile Leu Cys Gln Arg Trp Pro Leu Pro
50 55 60

Ser Ala Ala Gln Asp Val Ile Cys His Leu Cys Pro His Arg Gln Glu
65 70 75 80

Pro Gly Leu Met Thr Ala Phe Gly Val Glu Pro Tyr Gly Gln Pro Lys
85 90 95

Tyr Leu Glu Ile Ala Gly Lys Arg Met Ala Tyr Ile Asp Glu Gly Lys
100 105 110

Gly Asp Ala Ile Val Phe Gln His Gly Asn Pro Thr Ser Ser Tyr Leu
115 120 125

Trp Arg Asn Ile Met Pro His Leu Glu Gly Leu Gly Arg Leu Val Ala
130 135 140

Cys Asp Leu Ile Gly Met Gly Ala Ser Asp Lys Leu Ser Pro Ser Gly
145 150 155 160

Pro Asp Arg Tyr Ser Tyr Gly Glu Gln Arg Asp Phe Leu Phe Ala Leu
165 170 175

Trp Asp Ala Leu Asp Leu Gly Asp His Val Val Leu Val Leu His Asp
180 185 190

Trp Gly Ser Ala Leu Gly Phe Asp Trp Ala Asn Gln His Arg Asp Arg
195 200 205

Val Gln Gly Ile Ala Phe Met Glu Ala Ile Val Thr Pro Met Thr Trp
210 215 220

Ala Asp Trp Pro Pro Ala Val Arg Gly Val Phe Gln Gly Phe Arg Ser
225 230 235 240

Pro Gln Gly Glu Pro Met Ala Leu Glu His Asn Ile Phe Val Glu Arg
245 250 255

Val Leu Pro Gly Ala Ile Leu Arg Gln Leu Ser Asp Glu Glu Met Asn
260 265 270

His Tyr Arg Arg Pro Phe Val Asn Gly Gly Glu Asp Arg Arg Pro Thr
275 280 285

Leu Ser Trp Pro Arg Asn Leu Pro Ile Asp Gly Glu Pro Ala Glu Val
290 295 300

Val Ala Leu Val Asn Glu Tyr Arg Ser Trp Leu Glu Glu Thr Asp Met
305 310 315 320

Pro Lys Leu Phe Ile Asn Ala Glu Pro Gly Ala Ile Ile Thr Gly Arg
325 330 335

Ile Arg Asp Tyr Val Arg Ser Trp Pro Asn Gln Thr Glu Ile Thr Val
340 345 350

Pro Gly Val His Phe Val Gln Glu Asp Ser Asp Gly Val Val Ser Trp
355 360 365

Ala Gly Ala Arg Gln His Arg Arg Pro Gly Ser Ala Leu Ile Ser Arg
370 375 380

Asp Gln Glu Cys Asp Phe Arg Arg Arg Arg Arg Pro Ala Cys Gln Leu
385 390 395 400

Ile Arg Leu Pro Ala Pro Gly Arg Asp Ser Gln Gly Lys Gly His Gln
405 410 415

Ser Gln Pro Leu Pro Ser Gln Arg Gly Arg Gln Ile Tyr Val Ala Gly
420 425 430

Gln Arg Ser Ser Tyr Leu Pro Ser Glu Leu Val Ala Ala Phe Leu Trp
435 440 445

Ala Gln Phe Glu Glu Ala Glu Arg Ile Thr Arg Ile Arg Leu Asp Leu
450 455 460

Trp Asn Arg Tyr His Glu Ser Phe Glu Ser Leu Glu Gln Arg Gly Leu
465 470 475 480

Leu Arg Arg Pro Ile Ile Pro Gln Gly Cys Ser His Asn Ala His Met
485 490 495

Tyr Tyr Val Leu Leu Ala Pro Ser Ala Asp Arg Glu Glu Val Leu Ala
500 505 510

Arg Leu Thr Ser Glu Gly Ile Gly Ala Val Phe His Tyr Val Pro Leu
515 520 525

His Asp Ser Pro Ala Gly Arg Arg
530 535

(2) INFORMATION FOR SEQ ID NO: 157:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 284 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 157:

Asn Glu Ser Ala Pro Arg Ser Pro Met Leu Pro Ser Ala Arg Pro Arg
1 5 10 15

Tyr Asp Ala Ile Ala Val Leu Leu Asn Glu Met His Ala Gly His Cys
20 25 30

Asp Phe Gly Leu Val Gly Pro Ala Pro Asp Ile Val Thr Asp Ala Ala
35 40 45

Gly Asp Asp Arg Ala Gly Leu Gly Val Asp Glu Gln Phe Arg His Val
50 55 60

Gly Phe Leu Glu Pro Ala Pro Val Leu Val Asp Gln Arg Asp Asp Leu
65 70 75 80

Gly Gly Leu Thr Val Asp Trp Lys Val Ser Trp Pro Arg Gln Arg Gly
85 90 95

Ala Thr Val Leu Ala Ala Val His Glu Trp Pro Pro Ile Val Val His
100 105 110

Phe Leu Val Ala Glu Leu Ser Gln Asp Arg Pro Gly Gln His Pro Phe
115 120 125

Asp Lys Asp Val Val Leu Gln Arg His Trp Leu Ala Leu Arg Arg Ser
130 135 140

Glu Thr Leu Glu His Thr Pro His Gly Arg Arg Pro Val Arg Pro Arg
145 150 155 160

His Arg Gly Asp Asp Arg Phe His Glu Arg Asp Pro Leu His Ser Val
165 170 175

Ala Met Leu Val Ser Pro Val Glu Ala Glu Arg Arg Ala Pro Val Val
180 185 190

Gln His Gln Tyr His Val Val Ala Glu Val Glu Arg Ile Pro Glu Arg
195 200 205

Glu Gln Lys Val Ser Leu Leu Ala Ile Ala Ile Ala Val Gly Ser Arg
210 215 220

Trp Ala Glu Leu Val Arg Arg Ala His Pro Asp Gln Ile Ala Gly His
225 230 235 240

Gln Pro Ala Gln Pro Phe Gln Val Arg His Asp Val Ala Pro Gln Val
245 250 255

Arg Arg Arg Gly Val Ala Val Leu Lys Asp Asp Gly Val Thr Leu Ala
260 265 270

Phe Val Asp Ile Arg His Ala Leu Pro Gly Asp Phe
275 280



(2) INFORMATION FOR SEQ ID NO: 158:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 264 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 158:

ATGAACATGT CGTCGGTGGT GGGTCGCAAG GCCTTTGCGC GATTCGCCGG CTACTCCTCC 60

GCCATGCACG CGATCGCCGG TTTCTCCGAT GCGTTGCGCC AAGAGCTGCG GGGTAGCGGA 120 

ATCGCCGTCT CGGTGATCCA CCCGGCGCTG ACCCAGACAC CGCTGTTGGC CAACGTCGAC 180 

CCCGCCGACA TGCCGCCGCC GTTTCGCAGC CTCACGCCCA TTCCCGTTCA CTGGGTCGCG 240

GCAGCGGTGC TTGACGGTGT GGCG 264 
(2) INFORMATION FOR SEQ ID NO: 159: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1171 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 159: 

TAGTCGGCGA CGATGACGTC GCGGTCCAGG CCGACCGCTT CAAGCACCAG CGCGACCACG 60 

AAGCCGGTGC GATCCTTACC CGCGAAGCAG TGGGTGAGCA CCGGGCGTCC GGCGGCAAGC 120 

AGTGTGACGA CACGATGTAG CGCGCGCTGT GCTCCATTGC GCGTTGGGAA TTGGCGATAC 180 

TCGTCGGTCA TGTAGCGGGT GGCCGCGTCA TTTATCGACT GGCTGGATTC GCCGGACTCG 240 

CCGTTGGACC CGTCATTGGT TAGCAGCCTC TTGAATGCGG TTTCGTGCGG CGCTGAGTCG 300

TCGGCGTCAT CATCGGCGAG GTCGGGGAAC GGCAGCAGGT GGACGTCGAT GCCGTCCGGA 360

ACCCGTCCTG GACCGCGGCG GGCAACCTCC CGGGACGACC GCAGGTCGGC AACGTCGGTG 420

ATCCCCAGCC GGCGCAGCGT TGCCCCTCGT GCCGAATTCG GCACGAGGCT GGCGAGCCAC 480 

CGGGCATCAC CAAGCAACGC TTGCCCAGTA CGGATCGTCA CTTCCGCATC CGGCAGACCA 540

ATCTCCTCGC CGCCCATCGT CAGATCCCGC TCGTGCGTTG ACAAGAACGG CCGCAGATGT 600

GCCAGCGGGT ATCGGAGATT GAACCGCGCA CGCAGTTCTT CAATCGCTGC GCGCTGCCGC 660

ACTATTGGCA CTTTCCGGCG GTCGCGGTAT TCAGCAAGCA TGCGAGTCTC GACGAACTCG 720 

CCCCACGTAA CCCACGGCGT AGCTCCCGGG GTGACGCGGA GGATCGGCGG GTGATCTTTG 780

CCGCCACGCT CGTAGCCGT7 GATCCACCGC TTCGCGGTGC CGGCGGGGAG GCCGATCAGC 840

TTATCGACCT CGGCGTATGC CGACGGCAAG CTGGGCGCGT TCGTCGAGGT CAAGAACTCC 900

ACCATCGGCA CCGGCACCAA GGTGCCGCAC CTGACCTACG TCGGCGACGC CGACATCGGC 960 

GAGTACAGCA ACATCGGCGC CTCCAGCGTG TTCGTCAACT ACGACGGTAC GTCCAAACGG 1020



CGCACCACCG TCGGTTCGCA CGTACGGACC GGGTCCGACA CCATGTTCGT GGCCCCAGTA 1080 

ACCATCGGCG ACGGCGCGTA TACCGGGGCC GGCACAGTGG TGCGGGAGGA TGTCCCGCCG 1140 

GGGGCGCTGG CAGTGTCGGC GGGTCCGCAA C 1171 
(2) INFORMATION FOR SEQ ID NO: 160: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 227 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 160:

GCAAAGGCGG CACCGGCGGG GCCGGCATGA ACAGCCTCGA CCCGCTGCTA GCCGCCCAAG 60 

ACGGCGGCCA AGGCGGCACC GGCGGCACCG GCGGCAACGC CGGCGCCGGC GGCACCAGCT 120 

TCACCCAAGG CGCCGACGGC AACGCCGGCA ACGGCGGTGA CGGCGGGGTC GGCGGCAACG 180 

GCGGAAACGG CGGAAACGGC GCAGACAACA CCACCACCGC CGCCGCC 227 
(2) INFORMATION FOR SEQ ID NO: 161:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 304 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 161:

CCTCGCCACC ATGGGCGGGC AGGGCGGTAG CGGTGGCGCC GGCTCTACCC CAGGCGCCAA 60 

GGGCGCCCAC GGCTTCACTC CAACCAGCGG CGGCGACGGC GGCGACGGCG GCAACGGCGG 120 

CAACTCCCAA GTGGTCGGCG GCAACGGCGG CGACGGCGGC AATGGCGGCA ACGGCGGCAG 180 

CGCCGGCACG GGCGGCAACG GCGGCCGCGG CGGCGACGGC GCGTTTGGTG GCATGAGTGC 240 

CAACGCCACC AACCCTGGTG AAAACGGGCC AAACGGTAAC CCCGGCGGCA ACGGTGGCGC 300

CGGC 304 
(2) INFORMATION FOR SEQ ID NO: 162:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1439 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 162:

GTGGGACGCT GCCGAGGCTG TATAACAAGG ACAACATCGA CCAGCGCCGG CTCGGTGAGC 60

TGATCGACCT ATTTAACAGT GCGCGCTTCA GCCGGCAGGG CGAGCACCGC GCCCGGGATC 120 

TGATGGGTGA GGTCTACGAA TACTTCCTCG GCAATTTCGC TCGCGCGGAA GGGAAGCGGG 180 

GTGGCGAGTT CTTTACCCCG CCCAGCGTGG TCAAGGTGAT CGTGGAGGTG CTGGAGCCGT 240

CGAGTGGGCG GGTGTATGAC CCGTGCTGCG GTTCCGGAGG CATGTTTGTG CAGACCGAGA 300 

AGTTCATCTA CGAACACGAC GGCGATCCGA AGGATGTCTC GATCTATGGC CAGGAAAGCA 360

TTGAGGAGAC CTGGCGGATG GCGAAGATGA ACCTCGCCAT CCACGGCATC GACAACAAGG 420 

GGCTCGGCGC CCGATGGAGT GATACCTTCG CCCGCGACCA GCACCCGGAC GTGCAGATGG 480

ACTACGTGAT GGCCAATCCG CCGTTCAACA TCAAAGACTG GGCCCGCAAC GAGGAAGACC 540 

CACGCTGGCG CTTCGGTGTT CCGCCCGCCA ATAACGCCAA CTACGCATGG ATTCAGCACA 600 

TCCTGTACAA CTTGGCGCCG GGAGGTCGGG CGGGCGTGGT GATGGCCAAC GGGTCGATGT 660 

CGTCGAACTC CAACGGCAAG GGGGATATTC GCGCGCAAAT CGTGGAGGCG GATTTGGTTT 720

CCTGCATGGT CGCGTTACCC ACCCAGCTGT TCCGCAGCAC CGGAATCCCG GTGTGCCTGT 780

GGTTTTTCGC CAAAAACAAG GCGGCAGGTA AGCAAGGGTC TATCAACCGG TGCGGGCAGG 840 

TGCTGTTCAT CGACGCTCGT GAACTGGGCG ACCTAGTGGA CCGGGCCGAG CGGGCGCTGA 900 

CCAACGAGGA GATCGTCCGC ATCGGGGATA CCTTCCACGC GAGCACGACC ACCGGCAACG 960 

CCGGCrcCGG TGGTGCCGGC GGTAATGGGG GCACTGGCCT CAACGGCGCG GGCGGTGCTG 1020 

GCGGGGCCGG CGGCAACGCG GGTGTCGCCG GCGTGTCCTT CGGCAACGCT GTGGGCGGCG 1080

ACGGCGGCAA CGGCGGCAAC GGCGGCCACG GCGGCGACGG CACGACGGGC GGCGCCGGCG 1140

GCAAGGGCGG CAACGGCAGC AGCGGTGCCG CCAGCGGCTC AGGCGTCGTC AACG7CACCG 1200

CCGGCCACGG CGGCAACGGC GGCAATGGCG GCAACGGCGG CAACGGCTCC GCGGGCGCCG 1260

GCGGCCAGGG CGGTGCCGGC GGCAGCGCCG GCAACGGCGG CCACGGCGGC GGTGCCACCG 1320

GCGGCGCCAG CGGCAAGGGC GGCAACGGCA CCAGCGGTGC CGCCAGCGGC TCAGGCGTCA 1380

TCAACGTCAC CGCCOGCCAC GGCGGCAACG GCGGCAATGG CCGCAACGGC GGCAACGGC 1439
(2) INFORMATION FOR SEQ ID NO: 163: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 329 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 163:



GGGCCGGCGG GGCCGGATTT TCTCGTGCCT TGATTGTCGC TGGGGATAAC GGCGGTGATG 60

GTGGTAACGG CGGGATGGGC GGGGCTGGCG GGGCTGGCGG CCCCGGCGGG GCCGGCGGCC 120

TGATCAGCCT GCTGGGCGGC CAAGGCGCCG GCGGGGCCGG CGGGACCGGC GGGGCCGGCG 180

GTGTTGGCGG TGACGGCGGG GCCGGCGGCC CCGGCAACCA GGCCTTCAAC GCAGGTGCCG 240

GCGGGGCCGG CGGCCTGATC AGCCTGCTGG GCGGCCAAGG CGCCGGCGGG GCCGGCGGGA 300

CCGGCGGGGC CGGCGGTGTT GGCGGTGAC 329



(2) INFORMATION FOR SEQ ID NO: 164:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 80 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 164:

GCAACGGTGG CAACGGCGGC ACCAGCACGA CCGTGGGGAT GGCCGGAGGT AACTGTGGTG 60

CCGCCGGGCT GATCGGCAAC 80

(2) INFORMATION FOR SEQ ID NO: 165:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 392 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 165:

GGGCTGTGTC GCACTCACAC CGCCGCATTC GGCGACGTTG GCCGCCCAAT ATCCAGCTCA 60 

AGGCCTACTA CTTACCGTCG GAGGACCGCC GCATCAAGGT GCGGGTCAGC GCCCAAGGAA 120

TCAAGGTCAT CGACCGCGAC GGGCATCGAG GCCGTCGTCG CGCGGCTCGG GCAGGATCCG 180

CCCCGGCGCA CTTCGCGCGC CAAGCGGGCT CATCGCTCCG AACGGCGGCG ATCCTGTGAG 240

CACAACTGAT GGCGCGCAAC GAGATTCGTC CAATTGTCAA GCCGTGTTCG ACCGCAGGGA 300

CCGGTTATAC GTATGTCAAC CTATGTCACT CGCAAGAACC GGCATAACGA TCCCGTGATC 360

CGCCGACAGC CCACGAGTGC AAGACCGTTA CA 392

(2) INFORMATION FOR SEQ ID NO: 166:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 535 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 166: 

ACCGGCGCCA CCGGCGGCAC CGGGTTCGCC GGTGGCGCCG GCGGGGCCGG CGGGCAGGGC 60 

GGTATCAGCG GTGCCGGCGG CACCAACGGC TCTGGTGGCG CTGGCGGCAC CGGCGGACAA 120 

GGCGGCGCCG GGGGCGCTGG CGGGGCCGGC GCCGATAACC CCACCGGCAT CGGCGGCGCC 180 

GGCGGCACCG GCGGCACCGG CGGAGCGGCC GGAGCCGGCG GGGCCGGTGG CGCCATCGGT 240 

ACCGGCGGCA CCGGCGGCGC GGTGGGCAGC GTCGGTAACG CCGGGATCGG CGGTACCGGC 300

GGTACGGGTG GTGTCGGTGG TGCTGGTGGT GCAGGTGCGG CTGCGGCCGC TGGCAGCAGC 360 

GCTACCGGTG GCGCCGGGTT CGCCGGCGGC GCCGGCGGAG AAGGCGGACC GGGCGGCAAC 420 

AGCGGTGTGG GCGGCACCAA CGGCTCCGGC GGCGCCGGCG GTGCAGGCGG CAAGGGCGGC 480 

ACCGGAGGTG CCGGCGGGTC CGGCGCGGAC AACCCCACCG GTGCTGGTTT CGCCG 535 
(2) INFORMATION FOR SEQ ID NO: 167: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 690 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 167:

CCGACGTCGC CGGGGCGATA CGGGGGTCAC CGACTACTAC ATCATCCGCA CCGAGAATCG 60 

GCCGCTGCTG CAACCGCTGC GGGCGGTGCC GGTCATCGGA GATCCGCTGG CCGACCTGAT 120

CCAGCCGAAC CTGAAGGTGA TCGTCAACCT GGGCTACGGC GACCCGAACT ACGGCTACTC 180

GACGAGCTAC GCCGATGTGC GAACGCCGTT CGGGCTGTGG CCGAACGTGC CGCCTCAGGT 240

CATCGCCGAT GCCCTGGCCG CCGGAACACA AGAAGGCATC CTTGACTTCA CGGCCGACCT 300 

GCAGGCGCTG TCCGCGCAAC CGCTCACGCT CCCGCAGATC CAGCTGCCGC AACCCGCCGA 360

TCTGGTGGCG GCGGTGGCCG CCGCACCGAC GCCGGCCGAG GTGGTGAACA CGCTCGCCAG 420

GATCATCTCA ACCAACTACG CCGTCCTGCT GCCCACCGTG GACATCGCCC TCGCCTGGTC 480

ACCACCCTGC CGCTGTACAC CACCCAACTG TTCGTCAGGC AACTCGCTGC GGGCAATCTG 540

ATCAACGCGA TCGGCTATCC CCTGGCGGCC ACCGTAGGTT TAGGCACGAT CGATAGCGGG 600

CGGCGTGGAA TTGCTCACCC TCCTCGCGGC GGCCTCGGAC ACCGTTCGAA ACATCGAGGG 660 

CCTCGTCACC TAACGGATTC CCGACGGCAT 690 
(2) INFORMATION FOR SEQ ID NO: 168: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 407 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 168:

ACGGTGACGG CGGTACTGGC GGCGGCCACG GCGGCAACGG CGGGAATCCC GGGTGGCTCT 60 

TGGGCACAGC CGGGGGTGGC GGCAACGGTG GCGCCGGCAG CACCGGTACT GCAGGTGGCG 120 

GCTCTGGGGG CACCGGCGGC GACGGCGGGA CCGGCGGGCG TGGCGGCCTG TTAATGGGCG 180 

CCGGCGCCGG CGGGCACGGT GGCACTGGCG GCGCGGGCGG TGCCGGTGTC GACGGTGGCG 240 

GCGCCGGCGG GGCCGGCGGG GCCGGCGGCA ACGGCGGCGC CGGGGGTCAA GCCGCCCTGC 300 

TGTTCGGGCG CGGCGGCACC GGCGGAGCCG GCGGCTACGG CGGCGATGGC GGTGGCGGCG 360 

GTGACGGCTT CGACGGCACG ATGGCCGGCC TGGGTGGTAC CGGTGGC 407 
(2) INFORMATION FOR SEQ ID NO: 169: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 468 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 169:

GATCGGTCAG CGCATCGCCC TCGGCGGCAA GCGATTCCGC GGTCTCACCG AAGAACATCG 60

TGCACGCGGC GGCGCGGACC AGCCCGCTGC GCTGCGGCGC GTCGAACGCC TCCAGCAGGC 120

ACAGCCAGTC CTTGGCGGCC TGCGAGGCGA ACACGTCGGT GTCACCGGTG TAGATCGCCG 180

GGATGCCCGC CTCCGCCAAC GCATTCCGGC ACGCCCGCGC GTCTTTGTGA TGCTCGACGA 240

TCACCGCGAT GTCTGCGGCC ACCACGGGCC GCCCGGCGAA GGTGGCCCCG CTGGCCAGTA 300

GCGCCGCGAC GTCGGCGGCC AGGTCGTCGG GGATGTGCCG GCGCAGCGCT CCGGCGCGAC 360

GCCCGAAAAA CGACCCCTCA CCCAGCTGGG TCCCGCTGGC ATATCCCTTG CCGTCCTGGG 420

CGATATTGGA CGCGCATGCC CCGACCGCGT ACAGGCCGGC CACCACCG 468
(2) INFORMATION FOR SEQ ID NO: 170:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 219 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 170:
GGTGGTAACG GCGGCCAGGG TGGCATCGGC GGCGCCGGCG AGAGAGGCGC CGACGGCGCC 60
GGCCCCAATG CTAACGGCGC AAACGGCGAG AACGGCGGTA GCGGTGGTAA CGGTGGCGAC 120
GGCGGCGCCG GCGGCAATGG CGGCGCGGGC GGCAACGCGC AGGCGGCCGG GTACACCGAC 180
GGCGCCACGG GCACCGGCGG CGACGGCGGC AACGGCGGC 219
(2) INFORMATION FOR SEQ ID NO: 171: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 494 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 171:



TAGCTCCGGC GAGGGCGGCA AGGGCGGCGA CGGTGGCCAC GGCGGTGACG GCGTCGGCGG 60

CAACAGTTCC GTCACCCAAG GCGGCAGCGG CGGTGGCGGC GGCGCCGGCG GCGCCGGCGG 120

CAGCGGCTTT TTCGGCGGCA AGGGCGGCTT GGGCGGCGAC GGCGGTCAGG GCGGCCCCAA 180

CGGCGGCGGT ACCGTCGGCA CCGTGGCCGG TGGCGGCGGC AACGGCGGTG TCGGCGGCCG 240

GGGCGGCGAC GGCGTCTTTG CCGGTGCCGG CGGCCAGGGC GGCCTCGGTG GGCAGGGCGG 300

CAATGGCGGC GGCTCCACCG GCGGCAACGG CGGCCTTGGC GGCGCCGGCG GTGGCGGAGG 360

CAACGCCCCG GCTCGTGCCG AATCCGGGCT GACCATGGAC AGCGCGGCCA AGTTCGCTGC 420

CATCGCATCA GGCGCGTACT GCCCCGAACA CCTGGAACAT CACCCGAGTT AGCGGGGCGC 480

ATTTCCTGAT CACC 494



(2) INFORMATION FOR SEQ ID NO: 172: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 220 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 172:
GGGCCGGTGG TGCCGCGGGC CAGCTCTTCA GCGCCGGAGG CGCGGCGGGT GCCGTTGGGG 60
TTGGCGGCAC CGGCGGCCAG GGTGGGGCTG GCGGTGCCGG AGCGGCCGGC GCCGACGCCC 120
CCGCCAGCAC AGGTCTAACC GGTGGTACCG GGTTCGCTGG CGGGGCCGGC GGCGTCGGCG 180
GCCAGAGCGG CAACGCCATT GCCGGCGGCA TCAACGGCTC 220
(2) INFORMATION FOR SEQ ID NO: 173:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 388 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 173:
ATGGCGGCAA CGGGGGCCCC GGCGGTGCTG GCGGGGCCGG CGACTACAAT TTCCAACGGC 60
GGGCAGGGTG GTGCCGGCGG CCAAGGCGGC CAAGGCGGCC TGGGCGGGGC AAGCACCACC 120
TGATCGGCCT AGCCGCACCC GGGAAAGCCG ATCCAACAGG CGACGATGCC GCCTTCCTTG 180
CCGCGTTGGA CCAGGCCGGC ATCACCTACG CTGACCCAGG CCACGCCATA ACGGCCGCCA 240
AGGCGATGTG TGGGCTGTGT GCTAACGGCG TAACAGGTCT ACAGCTGGTC GCGGACCTGC 300
GGGACTACAA TCCCGGGCTG ACCATGGACA GCGCGGCCAA GTTCGCTGCC ATCGCATCAG 360
GCGCGTACTG CCCCGAACAC CTGGAACA 388
(2) INFORMATION FOR SEQ ID NO: 174:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 400 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 174:
GCAAAGGCGG CACCGGCGGG GCCGGCATGA ACAGCCTCGA CCCGCTGCTA GCCGCCCAAG 60
ACGGCGGCCA AGGCGGCACC GGCGGCACCG GCGGCAACGC CGGGGCCGGC GGCACCAGCT 120
TCACCCAAGG CGCCGACGGC AACGCCGGCA ACGGCGGTGA CGGCGGGGTC GGCGGCAACG 180
GCGGAAACGG CGGAAACGGC GCAGACAACA CCACCACCGC CGGGGCCGGC ACCACAGGCG 240
GCGACGGCGG GGCCGGCGGG GCCGGCGGAA CCGGCGGAAC CGGCGGAGCC GCCGGCACCG 300
GCACCGGCGG CCAACAAGGC AACGGCGGCA ACGGCGGCAC CGGCGGCAAA GGCGGCACCG 360
GCGGCGACGG TGCACTCTCA GGCAGCACCG GTGGTGCCGG 400
(2) INFORMATION FOR SEQ ID NO: 175:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 538 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 175:
GGCAACGGCG GCAACGGCGG CATCGCCGGC AITGGGCGGC AACGGCGTTC CGGGACGGGC 60
AGCGGCAACG GCGGCCAACG GCGGCAGCGG CGGCAACGGC GGCAACGCCG GCATGGGCGG 120
CAACAGCGGC ACCGGCAGCG GCGACGGCGG TGCCGGCGGG AACGGCGGCG CGGCGGGCAC 180
GGGCGGCACC GGCGGCGACG GCGGCCTCAC CGGTACTGGC GGCACCGGCG GCAGCGGTGG 240
CACCGGCGGT GACGGCGGTA ACGGCGGCAA CGGAGCAGAT AACACCGCAA ACATGACTGC 300
GCAGGCGGGC GGTGACGGTG GCAACGGCGG CGACGGTGGC TTCGGCGGCG GGGCCGGGGC 360
CGGCGGCGGT GGCTTGACCG CTGGCGCCAA CGGCACCGGC GGGCAAGGCG GCGCCGGCGG 420
CGATGGCGGC AACGGGGCCA TCGGCGGCCA CGGCCCACTC ACTGACGACC CCGGCGGCAA 480
CGGGGGCACC GGCGGCAACG GCGGCACCGG CGGCACCGGC GGCGCGGGCA TCGGCAGC 538
(2) INFORMATION FOR SEQ ID NO: 176:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 239 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 176:

GGGCCGGTGG TGCCGCGGGC CAGCTCTTCA GCGCCGGAGG CGCGGCGGGT GCCGTTGGGG 60

TTGGCGGCAC CGGCGGCCAG GGTGGGGCTG GCGGTGCCGG AGCGGCCGGC GCCGACGCCC 120

CCGCCAGCAC AGGTCTAACC GGTGGTACCG GGTTCGCTGG CGGGGCCGGC GGCGTCGGCG 180

GCCACGGCGG CAACGCCATT GCCGGCGGCA TCAACGGCTC CGGTGGTGCC GGCGGCACC 239

(2) INFORMATION FOR SEQ ID NO: 177:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 985 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 177:

AGCAGCGCTA CCGGTGGCGC CGGGTTCGCC GGCGGCGCCG GCGGAGAAGG CGGAGCGGGC 60 

GGCAACAGCG GTGTGGGCGG CACCAACGGC TCCGGCGGCG CCGGCGGTGC AGGCGGCAAG 120 

GGCGGCACCG GAGGTGCCGG CGGGTCCGGC GCGGACAACC CCACCGGTGC TGGTTTCGCC 180 

GGTGGCGCCG GCGGCACAGG TGGCGCGGCC GGCGCCGGCG GGGCCGGCGG GGCGACCGGT 240 

ACCGGCGGCA CCGGCGGCGT TGTCGGCGCC ACCGGTAGTG CAGGCATCGG CGGGGCCGGC 300

GGCCGCGGCG GTGACGGCGG CGATGGGGCC AGCGGTCTCG GCCTGGGCCT CTCCGGCTTT 360

GACGGCGGCC AAGGCGGCCA AGGCGGGGCC GGCGGCAGCG CCGGCGCCGG CGGCATCAAC 420 

GGGGCCGGCG GGGCCGGCGG CAACGGCGGC GACGGCGGGG ACGGCGCAAC CGGTGCCGCA 480

GGTCTCGGCG ACAACGGCGG GGTCGGCGGT GACGGTGGGG CCGGTGGCGC CGCCGGCAAC 540 

GGCGGCAACG CGGGCGTCGG CCTGACAGCC AAGGCCGGCG ACGGCGGCGC CGCGGGCAAT 600 

GGCGGCAACG GGGGCGCCGG CGGTGCTGGC GGGGCCGGCG ACAACAATTT CAACGGCGGC 660 

CAGGGTGGTG CCGGCGGCCA AGGCGGCCAA GGCGGCTTGG GCGGGGCAAG CACCACCTGA 720 

TCGGCCTAGC CGCACCCGGG AAAGCCGATC CAACAGGCGA CGATGCCGCC TTCCTTGCCG 780 

CGTTGGACCA GGCCGGCATC ACCTACGCTG ACCCAGGCCA CGCCATAACG GCCGCCAAGG 840

CGATGTGTGG GCTGTGTGCT AACGGCGTAA CAGGTCTACA GCTGGTCGCG GACCTGCGGG 900

AATACAATCC CGGGCTGACC ATGGACAGCG CGGCCAAGTT CGCTGCCATC GCATCAGGCG 960

CGTACTGCCC CGAACACCTG GAACA 985
(2) INFORMATION FOR SEQ ID NO: 178:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2138 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 178:

CGGCACGAGG ATCGGTACCC CGCGGCATCG GCAGCTGCCG ATTCGCCGGG TTTCCCCACC 60 

CGAGGAAAGC CGCTACCAGA TGGCGCTGCC GAAGTAGGGC GATCCGTTCG CGATGCCGGC 120 

ATGAACGGGC GGCATCAAAT TAGTGCAGGA ACCTTTCAGT TTAGCGACGA TAATGGCTAT 180

AGCACTAAGG AGGATGATCC GATATGACGC AGTCGCAGAC CGTGACGGTG GATCAGCAAG 240

AGATTTTGAA CAGGGCCAAC GAGGTGGAGG CCCCGATGGC GGACCCACCG ACTGATGTCC 300

CCATCACACC GTGCGAACTC ACGGCGGCTA AAAACGCCGC CCAACAGCTG GTATTGTCCG 360 

CCGACAACAT GCGGGAATAC CTGGCGGCCG GTGCCAAAGA GCGGCAGCGT CTGGCGACCT 420 

CGCTGCGCAA CGCGGCCAAG GCGTATGGCG AGGTTGATGA GGAGGCTGCG ACCGCGCTGG 480 

ACAACGACGG CGAAGGAACT GTGCAGGCAG AATCGGCCGG GGCCGTCGGA GGGGACAGTT 540 

CGGCCGAACT AACCGATACG CCGAGGGTGG CCACGGCCGG TGAACCCAAC TTCATGGATC 600 

TCAAAGAAGC GGCAAGGAAG CTCGAAACGG GCGACCAAGG CGCATCGCTC GCGCACTTTG 660 

CGGATGGGTG GAACACTTTC AACCTGACGC TGCAAGGCGA CGTCAAGCGG TTCCGGGGGT 720 

TTGACAACTG GGAAGGCGAT GCGGCTACCG CTTGCGAGGC TTCGCTCGAT CAACAACGGC 780

AATGGATACT CCACATGGCC AAATTGAGCG CTGCGATGGC CAAGCAGGCT CAATATGTCG 840 

CGCAGCTGCA CGTGTGGGCT AGGCGGGAAC ATCCGACTTA TGAAGACATA GTCGGGCTCG 900 

AACGGCTTTA CGCGGAAAAC CCTTCGGCCC GCGACCAAAT TCTCCCGGTG TACGCGGAGT 960 

ATCAGCAGAG GTCGGAGAAG GTGCTGACCG AATACAACAA CAAGGCAGCC CTGGAACCGG 1020 

TAAACCCGCC GAAGCCTCCC CCCGCCATCA AGATCGACCC GCCCCCGCCT CCGCAAGAGC 1080 

AGGGATTGAT CCCTGGCTTC CTGATGCCGC CGTC7GACGG CTCCGGTGTG ACTCCCGGTA 1140

CCGGGATGCC AGCCGCACCG ATGGTTCCGC CTACCGGATC GCCGGGTGGT GGCCTCCCGG 1200 

CTGACACGGC GGCGCAGCTG ACGTCGGCTG GGCGGGAAGC CGCAGCGCTG TCGGGCGACG 1260 

TGGCGGTCAA AGCGGCATCG CTCGGTGGCG GTGGAGGCGG CGGGGTGCCG TCGGCGCCGT 1320 

TGGGATCCGC GATCGGGGGC GCCGAATCGG TGCGGCCCGC TGGCGCTGGT GACATTGCCG 1380

GCTTAGGCCA GGGAAGGGCC GGCGGCGGCG CCGCGCTGGG CGGCGGTGGC ATGGGAATGC 1440

CGATGGGTGC CGCGCATCAG GGACAAGGGG GCGCCAAGTC CAAGGGTTCT CAGCAGGAAG 1500 

ACGAGGCGCT CTACACCGAG GATCGGGCAT GGACCGAGGC CGTCATTGGT AACCGTCGGC 1560 

GCCAGGACAG TAAGGAGTCG AAGTGAGCAT GGACGAATTG GACCCGCATG TCGCCCGGGC 1620

GTTGACGCTG GCGGCGCGGT TTCAGTCGGC CCTAGACGGG ACGCTCAATC AGATGAACAA 1680

CGGATCCrrC CGCGCCACCG ACGAAGCCGA GACCGTCGAA GTGACGATCA ATGGGCACCA 1740

GTGGCTCACC GGCCTGCGCA TCGAAGATGG TTTGCTGAAG AAGCTGGGTG CCGAGGCGGT 1800 

GGCTCAGCGG GTCAACGAGG CGCTGCACAA TGCGCAGGCC GCGGCGTCCG CGTATAACGA 1860

CGCGGCGGGC GAGCAGCTGA CCGCTGCGTT ATCGGCCATG TCCCGCGCGA TGAACGAAGG 1920

AATGGCCTAA GCCCArTGTT GCGGTGGTAG CGACTACGCA CCGAATGAGC GCCGCAATGC 1980 

GGTCATTCAG CGCGCCCGAC ACGGCGTGAG TACGCATTGT CAATGTTTTG ACATGGATCG 2040 

GCCGGGTTCG GAGGGCGCCA TAGTCCTGGT CGCCAATATT GCCGCAGCTA GCTGGTCTTA 2100

GGTTCGGTTA CGCTGGTTAA TTATGACGTC CGTTACCA 2138

(2) INFORMATION FOR SEQ ID NO: 179:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 460 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 179:

Met Thr Gln Ser Gln Thr Val Thr Val Asp Gln Gln Glu Ile Leu Asn
1 5 10 15

Arg Ala Asn Glu Val Glu Ala Pro Met Ala Asp Pro Pro Thr Asp Val
20 25 30

Pro Ile Thr Pro Cys Glu Leu Thr Ala Ala Lys Asn Ala Ala Gln Gln
35 40 45

Leu Val Leu Ser Ala Asp Asn Met Arg Glu Tyr Leu Ala Ala Gly Ala
50 55 60

Lys Glu Arg Gln Arg Leu Ala Thr Ser Leu Arg Asn Ala Ala Lys Ala
65 70 75 80

Tyr Gly Glu Val Asp Glu Glu Ala Ala Thr Ala Leu Asp Asn Asp Gly
85 90 95

Glu Gly Thr Val Gln Ala Glu Ser Ala Gly Ala Val Gly Gly Asp Ser
100 105 110

Ser Ala Glu Leu Thr Asp Thr Pro Arg Val Ala Thr Ala Gly Glu Pro
115 120 125

Asn Phe Met Asp Leu Lys Glu Ala Ala Arg Lys Leu Glu Thr Gly Asp
130 135 140

Gln Gly Ala Ser Leu Ala His Phe Ala Asp Gly Trp Asn Thr Phe Asn
145 150 155 160

Leu Thr Leu Gln Gly Asp Val Lys Arg Phe Arg Gly Phe Asp Asn Trp
165 170 175

Glu Gly Asp Ala Ala Thr Ala Cys Glu Ala Ser Leu Asn Gln Gln Arg
180 185 190






Gln Trp Ile Leu His Met Ala Lys Leu Ser Ala Ala Met Ala Lys Gln
195 200 205

Ala Gln Tyr Val Ala Gln Leu His Val Trp Ala Arg Arg Glu His Pro
210 215 220

Thr Tyr Glu Asp Ile Val Gly Leu Glu Arg Leu Tyr Ala Glu Asn Pro
225 230 235 240

Ser Ala Arg Asp Gln Ile Leu Pro Val Tyr Ala Glu Tyr Gln Gln Arg
245 250 255

Ser Glu Lys Val Leu Thr Glu Tyr Asn Asn Lys Ala Ala Leu Glu Pro
260 265 270

Val Asn Pro Pro Lys Pro Pro Pro Ala Ile Lys Ile Asp Pro Pro Pro
275 280 285

Pro Pro Gln Glu Gln Gly Leu Ile Pro Gly Phe Leu Met Pro Pro Ser
290 295 300

Asp Gly Ser Gly Val Thr Pro Gly Thr Gly Met Pro Ala Ala Pro Met
305 310 315 320

Val Pro Pro Thr Gly Ser Pro Gly Gly Gly Leu Pro Ala Asp Thr Ala
325 330 335

Ala Gln Leu Thr Ser Ala Gly Arg Glu Ala Ala Ala Leu Ser Gly Asp
340 345 350

Val Ala Val Lys Ala Ala Ser Leu Gly Gly Gly Gly Gly Gly Gly Val
355 360 365

Pro Ser Ala Pro Leu Gly Ser Ala Ile Gly Gly Ala Glu Ser Val Arg
370 375 380

Pro Ala Gly Ala Gly Asp Ile Ala Gly Leu Gly Gln Gly Arg Ala Gly
385 390 395 400

Gly Gly Ala Ala Leu Gly Gly Gly Gly Met Gly Met Pro Met Gly Ala
405 410 415

Ala His Gln Gly Gln Gly Gly Ala Lys Ser Lys Gly Ser Gln Gln Glu
420 425 430

Asp Glu Ala Leu Tyr Thr Glu Asp Arg Ala Trp Thr Glu Ala Val Ile
435 440 445

Gly Asn Arg Arg Arg Gln Asp Ser Lys Glu Ser Lys
450 455 460

(2) INFORMATION FOR SEQ ID NO: 180:

(i) SEQUENCE CHARACTERISTICS:






(A) LENGTH: 277 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 180:

Ala Gly Asn Val Thr Ser Ala Ser Gly Pro His Arg Phe Gly Ala Pro
1 5 10 15

Asp Arg Gly Ser Gln Arg Arg Arg Arg His Pro Ala Ala Ser Thr Ala
20 25 30

Thr Glu Arg Cys Arg Phe Asp Arg His Val Ala Arg Gln Arg Cys Gly
35 40 45

Phe Pro Pro Ser Arg Arg Gln Leu Arg Arg Arg Val Ser Arg Glu Ala
50 55 60

Thr Thr Arg Arg Ser Gly Arg Arg Asn His Arg Cys Gly Trp His Pro
65 70 75 80

Gly Thr Gly Ser His Thr Gly Ala Val Arg Arg Arg His Gln Glu Ala
85 90 95

Arg Asp Gln Ser Leu Leu Leu Arg Arg Arg Gly Arg Val Asp Leu Asp
100 105 110

Gly Gly Gly Arg Leu Arg Arg Val Tyr Arg Phe Gln Gly Cys Leu Val
115 120 125

Val Val Phe Gly Gln His Leu Leu Arg Pro Leu Leu Ile Leu Arg Val
130 135 140

His Arg Glu Asn Leu Val Ala Gly Arg Arg Val Phe Arg Val Lys Pro
145 150 155 160

Phe Glu Pro Asp Tyr Val Phe Ile Ser Arg Met Phe Pro Pro Ser Pro
165 170 175

His Val Gln Leu Arg Asp Ile Leu Ser Leu Leu Gly His Arg Ser Ala
180 185 190

Gln Phe Gly His Val Glu Tyr Pro Leu Pro Leu Leu Ile Glu Arg Ser
195 200 205

Leu Ala Ser Gly Ser Arg Ile Ala Phe Pro Val Val Lys Pro Pro Glu
210 215 220

Pro Leu Asp Val Ala Leu Gln Arg Gln Val Glu Ser Val Pro Pro Ile
225 230 235 240

Arg Lys Val Arg Glu Arg Cys Ala Leu Val Ala Arg Phe Glu Leu Pro
245 250 255

Cys Arg Phe Phe Glu Ile His Glu Val Gly Phe Thr Gly Arg Gly His
260 265 270

Pro Arg Arg Ile Gly
275

(2) INFORMATION FOR SEQ ID NO: 181:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 181: 

Arg Val Ala Ala Ser Phe Ile Asp Trp Leu Asp Ser Pro Asp Ser Pro
1 5 10 15

Leu Asp Pro Ser Leu Val Ser Ser Leu Leu Asn Ala Val Ser Cys Gly
20 25 30

Ala Glu Ser Ser Ala Ser Ser Ser Ala Arg Ser Gly Asn Gly Ser Arg
35 40 45

Trp Thr Ser Met Pro Ser Gly Thr Arg Pro Gly Pro Arg Arg Ala Thr
50 55 60

Ser Arg Asp Asp Arg Arg Ser Ala Thr Ser Val Ile Pro Ser Arg Arg
65 70 75 80

Ser Val Ala Pro Arg Ala Glu Phe Gly Thr Arg Leu Ala Ser His Arg
85 90 95

Ala Ser Pro Ser Asn Ala Cys Pro Val Arg Ile Val Thr Ser Ala Ser
100 105 110

Gly Arg Pro Ile Ser Ser Pro Pro Ile Val Arg Ser Arg Ser Cys Val
115 120 125

Asp Lys Asn Gly Arg Arg Cys Ala Ser Gly Tyr Arg Arg Leu Asn Arg
130 135 140

Ala Arg Ser Ser Ser Ile Ala Ala Arg Cys Arg Thr Ile Gly Thr Phe
145 150 155 160

Arg Arg Ser Arg Tyr Ser Ala Ser Met Arg Val Ser Thr Asn Ser Pro
165 170 175

His Val Thr His Gly Val Ala Pro Gly Val Thr Arg Arg Ile Gly Gly
180 185 190



(2) INFORMATION FOR SEQ ID NO: 182:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 196 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:182: 

Gln Glu Arg Pro Gln Met Cys Gln Arg Val Ser Glu Ile Glu Pro Arg
1 5 10 15

Thr Gln Phe Phe Asn Arg Cys Ala Leu Pro His Tyr Trp His Phe Pro
20 25 30

Ala Val Ala Val Phe Ser Lys His Ala Ser Leu Asp Glu Leu Ala Pro
35 40 45

Arg Asn Pro Arg Arg Ser Ser Arg Arg Asp Ala Glu Asp Arg Arg Val
50 55 60

Ile Phe Ala Ala Thr Leu Val Ala Val Asp Pro Pro Leu Arg Gly Ala
65 70 75 80

Gly Gly Glu Ala Asp Gln Leu Ile Asp Leu Gly Val Cys Arg Arg Gln
85 90 95

Ala Gly Arg Val Arg Arg Gly Gln Glu Leu His His Arg His Arg His
100 105 110

Gln Gly Ala Ala Pro Asp Leu Arg Arg Arg Arg Arg His Arg Arg Val
115 120 125

Gln Gln His Arg Arg Leu Gln Arg Val Arg Gln Leu Arg Arg Tyr Val
130 135 140

Gln Thr Ala His His Arg Arg Phe Ala Arg Thr Asp Arg Val Arg His
145 150 155 160

His Val Arg Gly Pro Ser Asn His Arg Arg Arg Arg Val Tyr Arg Gly
165 170 175

Arg His Ser Gly Ala Gly Gly Cys Pro Ala Gly Gly Ala Gly Ser Val
180 185 190

Gly Gly Ser Ala
195



(2) INFORMATION FOR SEQ ID NO:183: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 311 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 183:

Val Arg Cys Gly Thr Leu Val Pro Val Pro Met Val Glu Phe Leu Thr
1 5 10 15

Ser Thr Asn Ala Pro Ser Leu Pro Ser Ala Tyr Ala Glu Val Asp Lys
20 25 30

Leu Ile Gly Leu Pro Ala Gly Thr Ala Lys Arg Trp Ile Asn Gly Tyr
35 40 45

Glu Arg Gly Gly Lys Asp His Pro Pro Ile Leu Arg Val Thr Pro Gly
50 55 60

Ala Thr Pro Trp Val Thr Trp Gly Glu Phe Val Glu Thr Arg Met Leu
65 70 75 80

Ala Glu Tyr Arg Asp Arg Arg Lys Val Pro Ile Val Arg Gln Arg Ala
85 90 95

Ala Ile Glu Glu Leu Arg Ala Arg Phe Asn Leu Arg Tyr Pro Leu Ala
100 105 110

His Leu Arg Pro Phe Leu Ser Thr His Glu Arg Asp Leu Thr Met Gly
115 120 125

Gly Glu Glu Ile Gly Leu Pro Asp Ala Glu Val Thr Ile Arg Thr Gly
130 135 140

Gln Ala Leu Leu Gly Asp Ala Arg Tru Leu Ala Ser Leu Val Pro Asn
145 150 155 160

Ser Ala Arg Gly Ala Thr Leu Arg Arg Leu Gly Ile Thr Asp Val Ala
165 170 175

Asp Leu Arg Ser Ser Arg Glu Val Ala Arg Arg Gly Pro Gly Arg Val
180 185 190

Pro Asp Gly Ile Asp Val His Leu Leu Pro Phe Pro Asp Leu Ala Asp
195 200 205

Asp Asp Ala Asp Asp Ser Ala Pro His Glu Thr Ala Phe Lys Arg Leu
210 215 220

Leu Thr Asn Asp Gly Ser Asn Gly Glu Ser Gly Glu Ser Ser Gln Ser
225 230 235 240

Ile Asn Asp Ala Ala Thr Arg Tyr Met Thr Asp Glu Tyr Arg Gln Phe
245 250 255

Pro Thr Arg Asn Gly Ala Gln Arg Ala Leu His Arg Val Val Thr Leu
260 265 270

Leu Ala Ala Gly Arg Pro Val Leu Thr His Cys Phe Ala Gly Lys Asp
275 280 285

Arg Thr Gly Phe Val Val Ala Leu Val Leu Glu Ala Val Gly Leu Asp
290 295 300

Arg Asp Val Ile Val Ala Asp
305 310

(2) INFORMATION FOR SEQ ID NO: 184: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2072 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 184:



CTCGTGCCGA TTCGGCACGA GCTGAGCAGC CCAAGGGGCC GTTCGGCGAA GTCATCGAGG 60

CATTCGCCGA CGGGCTGGCC GGCAAGGGTA AGCAAATCAA CACCACGCTG AACAGCCTGT 120

CGCAGGCGTT GAACGCCTTG AATGAGGGCC GCGGCGACTT CTTCGCGGTG GTACGCAGCC 180

TGGCGCTATT CGTCAACGrn CTACATCAGG ACGACCAACA GTTCGTCGCG TTGAACAAGA 240

ACCTTGCGGA AGGTTGACCC ACTCCGATGC GGACCTGTCG AACGCCATCC 300

AGCAATTCGA CAGCTTGCTC GCCGTCGCGC GCCCGTTCTT CGCCAAGAAC CGCGAGGTGC 360

TGACGCATGA CGTCAATAAT CTCGCGACCG TGACCACCAC GTTGCTGCAG CCCGATCCGT 420

TGGATGGGTT GGAGACCGTC CTGCACATCT TCCCGACGCT GGCGGCGAAC ATTAACCAGC 480

TTTACCATCC GACACACGGT GGCGTGGTGT CGCTTTCCGC GTTCACGAAT TTCGCCAACC 540

CGATGGAGTT CATCTGCAGC TCGATTCAGG CGGGTAGCCG GCTCGGTTAT CAAGAGTCGG 600

CCGAACTCTG TGCGCAGTAT CTGGCGCCAG TCCTCGATGC GATCAAGTTC AACTACTTTC 660

CGTTCGGCCT GAACGTGGCC AGCACCGCCT CGACACTGCC TAAAGAGATC GCGTACTCCG 720

AGCCCCGCTT GCAGCCGCCC AACGGGTACA AGGACACCAC GGTGCCCGGC ATCTGGGTGC 780

CGGATACGCC GTTGTCACAC CGCAACACGC AGCCCGGTTG GGTGGTGGCA CCCGGGATGC 840

AAGGGGTTCA GGTGGGACCG ATCACGCAGG GTTTGCTGAC GCCGGAGTCC CTGGCCGAAC 900

TCATGGGTGG TCCCGATATC GCCCCTCCGT CGTCAGGGCT GCAAACCCCG CCCGGACCCC 960

CGAATGCGTA CGACGAGTAC CCCGTGCTGC CGCCGATCGG TTTACAGGCC CCACAGGTGC 1020

CGATACCACC GCCGCCTCCT GGGCCCGACG TAATCCCGGG TCCGGTGCCA CCGGTCTTGG 1080

CGGCGATCGT GTTCCCAAGA GATCGCCCGG CAGCGTCGGA AAACTTCGAC TACATGGGCC 1140

TCTTGTTGCT GTCGCCGGGC CTGGCGACCT TCCTGTTCGG GGTGTCATCT AGCCCCGCCC 1200

GTGGAACGAT GGCCGATCGG CACGTGTTGA TACCGGCGAT CACCGGCCTG GCGTTGATCG 1260 

CGGCATTCGT CGCACATTCG TGGTACCGCA CAGAACATCC GCTCATAGAC ATGCGCTTGT 1320 

TCCAGAACCG AGCGGTCGCG CAGGCCAACA TGACGATGAC GGTGCTCTCC CTCGGGCTGT 1380

TTGGCTCCTT CTTGCTGCTC CCGAGCTACC TCCAGCAAGT GTTGCACCAA TCACCGATGC 1440 

AATCGGGGGT GCATATCATC CCACAGGGCC TCGGTGCCAT GCTGGCGATG CCGATCGCCG 1500 

GAGCGATGAT GGACCGACGG GGACCGGCCA AGATCGTGCT GGTTGGGATC ATGCTGATCG 1560 

CTGCGGGGTT GGGCACCTTC GCCTTTGGTG TCGCGCGGCA AGCGGACTAC TTACCCATTC 1620 

TGCCGACCGG GCTGGCAATC ATGGGCATGG GCATGGGCTG CTCCATGATG CCACTGTCCG 1680 

GGGCGGCAGT GCAGACCCTG GCCCCACATC AGATCGCTCG CGGTTCGACG CTGATCAGCG 1740

TCAACCAGCA GGTGGGCGGT TCGATAGGGA CCGCACTGAT GTCGGTGCTG CTCACCTACC 1800 

AGTTCAATCA CAGCGAAATC ATCGCTACTG CAAAGAAAGT CGCACTGACC CCAGAGAGTG 1860 

GCGCCGGGCG GGGGGCGGCG GTTGACCCTT CCTCGCTACC GCGCCAAACC AACTTCGCGG 1920 

CCCAACTGCT GCATGACCTT TCGCACGCCT ACGCGGTGGT ATTCGTGATA GCGACCGCGC 1980 

TAGTGGTCTC GACGCTGATC CCCGCGGCAT TCCTGCCGAA ACAGCAGGCT AGTCATCGAA 2040 

GAGCACCGTT GCTATCCGCA TGACGTCTGC TT 2072 
(2) INFORMATION FOR SEQ ID NO: 185:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1923 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 185:

TCACCCCGGA GAAGTCGTTC GTCGACGACC TGGACATCGA CTCGCTGTCG ATGGTCGAGA 60 

TCGCCGTGCA GACCGAGGAC AAGTACGGCG TCAAGATCCC CGACGAGGAC CTCGCCGGTC 120 

TGCGTACCGT CGGTGACGTT GTCGCCTACA TCCAGAAGCT CGAGGAAGAA AACCCGGAGG 180 

CGGCTCAGGC GTTGCGCGCG AAGATTGAGT CGGAGAACCC CGATGCGGCA CGAGCAGATC 240 

GGTGCGTTTC ACCCACATCG CAAGCTCGAG ACGCCCGTCG TCCTCTTGCA CGCTCAGCCA 300 

GGTTGGCGTG TCGCCGCCTT CCAGCAAGTG TTCCCACCAC ACGAAGGGAC CCTCGCGAAA 360

GGTGACTGAT CCGCGGACCA CATAGTCGAT GCCACCGTGG CTGACAATTG CGCCGGGTCC 420

GAGTTGGCGG GGGCCGAATT GCGGCATTGC GTCGAAGGCC AGCGGATCCC GGCGCCCGCC 480

CGGCGTGGCT GGTGTTTTGG GCCGCCGGAT GGCCACGACG AGAACGACGA TGGCGGCGAT 540

GAACAGCGCC ACGGCAATCA CGACCAGCAG ATTTCCCACG CATACCCTCT CGTACCGCTG 600 

CGCCGCGGTT GGTCGATCGG TCGCATATCG ATGGCGCCGT TTAACGTAAC AGCTTTCGCG 660 

GGACCGGGGG TCACAACGGG COAGTTGTCC GGCCGGGAAC CCGGCAGGTC TCGGCCGCGG 720 

TCACCCCAGC TCACTGGTGC ACCATCCGGG TGTCGGTGAG CGTGCAACTC AAACACACTC 780 

AACGGCAACG GTTTCTCAGG TCACCAGCTC AACCTCGACC CGCAATCGCT CGTACGTTTC 840

GACCGCGCGC AGGTCGCGAG TCAGCAGCTT TGCGCCGGCA GCTTTCGCCG TGAAGCCGAC 900 

CAGGGCATCG TAGGTTGCGC CACCGGTGAC ATCGTGCTCG GCGAGGTGGT CGGTCAAGCC 960 

GCGATATGAG CAGGCATCCA GTGCCAGGTA GTTGCTGGAG GTGATGTCCG CCAAGTAGGC 1020 

GTGGACGGCA ACAGGGGCAA TACGATGCGG CGGTGGTAGC CGGGTCAAGA CCGAATAGGT 1080

TTCCACAGCC GCGTGCGCGA TCAGATGGAC GCCACGGTTG AGCGCGCGCA CGGCGGCCTC 1140

GTGCCCTTCG TGCCAGGTCG CGAATCCGGC AACCAGCACG CTGGTGTCTG GTGCGATCAC 1200

CGCCGTGTGC GATCGAGCGT TTCCCGAACG ATTTCGTCGG TCAACGGGGG CAGGGGACGT 1260

TCTGGCCGTG CGACGAGAAC CGAGCCTTCC CGAACGAGTT CGACACCGGT CGGGGCCGGC 1320

TCAATCTCGA TGCGCCCATC GCGCTCGGTG ATCTCCACCT GGTCGTTCCC GCGCAAGCCA 1380

AGGCGCTCGC GAATCCGCTT GGGAATCACC AGACGTCGTG CGACATCGAT GGTTGTTCGC 1440

ATGGTAGGAA ATTTACCATC GCACGTTCCA TAGGCGTGTC CTGCGCGGGA TGTCGGGACG 1500

ATCCGCTAGC GTATCGAACG ATTGTTTCGG AAATGGCTGA GGGAGCGTGC GGTGCGGGTG 1560

ATGGGTGTCG ATCCCGGGTT GACCCGATGC GGGCTGTCGC TCATCGAGAG TGGGCGTGGT 1620

CGGCAGCTCA CCGCGCTGGA TGTCGACGTG GTGCGCACAC CGTCGGATGC GGCCTTGGCG 1680

CAGCGCCTGT TGGCCATCAG CGATGCCGTC GAGCACTGGC TGGACACCCA TCATCCGGAG 1740

GTGGTGGCTA TCGAACGGGT GTTCTCTCAG CTCAACGTGA CCACGGTGAT GGGCACCGCG 1800

CAGGCCGGCG GCGTGATCGC CCTGGCGGCG GCCAAACGTG GTGTCGACGT GCATTTCCAT 1860

ACCCCCAGCG AGGTCAAGGC GGCGGTCACT GGCAACGGTT CCGCAGACAA GGCTCAGGTC 1920

ACC 1923 
(2) INFORMATION FOR SEQ ID NO: 186:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1055 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 186: 

CTGGCGTGCC AGTGTCACCG GCGATATGAC GTCGGCATTC AATTTCGCGG CCCCGCCGGA 60 

CCCGTCGCCA CCCAATCTGG ACCACCCGGT CCGTCAATTG CCGAAGGTCG CCAAGTGCGT 120

GCCCAATGTG GTGCTGGGTT TCTTGAACGA AGGCCTGCCG TATCGGGTGC CCTACCCCCA 180

AACAACGCCA GTCCAGGAAT CCGGTCCCGC GCGGCCGATT CCCAGCGGCA TCTGCTAGCC 240

GGGGATGGTT CAGACGTAAC GGTTGGCTAG GTCGAAACCC GCGCCAGGGC CGCTGGACGG 300

GCTCATGGCA GCGAAATTAG AAAACCCGGG ATATTGTCCG CGGATTGTCA TACGATGCTG 360

AGTGCTTGGT GGTTCGTGTT TAGCCATTGA GTGTGGATGT GTTGAGACCC TGGCCTGGAA 420

GGGGACAACG TGCTTTTGCC TCTTGGTCCG CCTTTGCCGC CCGACGCGGT GGTGGCGAAA 480

CGGGCTGAGT CGGGAATGCT CGGCGGGTTG TCGGTTCCGC TCAGCTGGGG AGTGGCTGTG 540

CCACCCGATG ATTATGACCA CTGGGCGCCT GCGCCGGAGG ACGGCGCCGA TGTCGATGTC 600

CAGGCGGCCG AAGGGGCGGA CGCAGAGGCC GCGGCCATGG ACGAGTGGGA TGAGTGGCAG 660

GCGTGGAACG AGTGGGTGGC GGAGAACGCT GAACCCCGCT TTGAGGTGCC ACGGAGTAGC 720

AGCAGCGTGA TTCC3CATTC TCCGGCGGCC GGCTAGGAGA GGGGGCGCAG ACTGTCGTTA 780

TTTGACCAGT GATCGGCGGT CTCGGTGTTC CCGCGGCCGG CTATGACAAC AGTCAATGTG 840

CATGACAAGT TACAGGTATT AGGTCCAGGT TCAACAAGGA GACAGGCAAC ATGGCAACAC 900

GTTTTATGAC GGATCCGCAC GCGATGCGGG ACATGGCGGG CCGTTTTGAG GTGCACGCCC 960

AGACGGTGGA GGACGAGGCT CGCCGGATGT GGGCGTCCGC GCAAAACATC TC3GGNGCGG 1020

GCTGGAGTGG CATGGCCGAG GCGACCTCGC TAGAC 1055

(2) INFORMATION FOR SEQ ID NO: 187: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 359 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 187:

rCGCrTCGTT GTTGGCATAC TCCGCCGCGG CCGCCTCGAC CGCACTGGCC GTGGCGTGTG 60

TCCGGGCTGA CCACCGGGAT CGCCGAACCA TCCGAGATCA CCTCGCAATG ATCCACCTCG 120

CGCAGCTGGT CACCCAGCCA CCGGGCGGTG TGCGACAGCG CCTGCATCAC CTTGGTATAG 180 

CCGTCGCGCC CCAGCCGCAG GAAGTTGTAG TACTGGCCCA CCACCTGGTT ACCGGGACGG 240 

GAGAAGTTCA GGGTGAAGGT CGGCATGTCG CCGCCGAGGT AGTTGACCCG GAAAACCAGA 300

TCCTCCGGCA GGTGCTCGGG CCCGCGCCAC ACGACAAACC CGACGCCGGG ATAGGTCAG 359 
(2) INFORMATION FOR SEQ ID NO: 188: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 350 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 188: 

AACGGGCCCG TGGGCACCGC TCCTCTAAGG GCTCTCGTTG GTCGCATGAA GTGCTGGAAG 60

GATGCATCTT GGCAGATTCC CGCCAGAGCA AAACAGCCGC TAGTCCTAGT CCGAGTCGCC 120 

CGCAAAGTTC CTCGAATAAC TCCGTACCCG GAGCGCCAAA CCGGGTCTCC TTCGCTAAGC 180 

TGCGCGAACC ACTTGAGGTT CCGGGACTCC TTGACGTCCA GACCGATTCG TTCGAGTGGC 240 

TGATCGGTTC GCCGCGCTGG CGCGAATCCG CCGCCGAGCG GGGTGATGTC AACCCAGTGG 300

GTGGCCTGGA AGAGGTGCTC TACGAGCTGT CTCCGATCGA GGACTTCTCC 350
(2) INFORMATION FOR SEQ ID NO: 189: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 679 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 189: 

Glu Gln Pro Lys Gly Pro Phe Gly Glu Val Ile Glu Ala Phe Ala Asp
1 5 10 15

Gly Leu Ala Gly Lys Gly Lys Gln Ile Asn Thr Thr Leu Asn Ser Leu
20 25 30

Ser Gln Ala Leu Asn Ala Leu Asn Glu Gly Arg Gly Asp Phe Phe Ala
35 40 45

Val Val Arg Ser Leu Ala Leu Phe Val Asn Ala Leu His Gln Asp Asp
50 55 60

Gln Gln Phe Val Ala Leu Asn Lys Asn Leu Ala Glu Phe Thr Asp Arg
65 70 75 80

Leu Thr His Ser Asp Ala Asp Leu Ser Asn Ala Ile Gln Gln Phe Asp
85 90 95

Ser Leu Leu Ala Val Ala Arg Pro Phe Phe Ala Lys Asn Arg Glu Val
100 105 110

Leu Thr His Asp Val Asn Asn Leu Ala Thr Val Thr Thr Thr Leu Leu
115 120 125

Gln Pro Asp Pro Leu Asp Gly Leu Glu Thr Val Leu His Ile Phe Pro
130 135 140

Thr Leu Ala Ala Asn Ile Asn Gln Leu Tyr His Pro Thr His Gly Gly
145 150 155 160

Val Val Ser Leu Ser Ala Phe Thr Asn Phe Ala Asn Pro Met Glu Phe
165 170 175

Ile Cys Ser Ser Ile Gln Ala Gly Ser Arg Leu Gly Tyr Gln Glu Ser
180 185 190

Ala Glu Leu Cys Ala Gln Tyr Leu Ala Pro Val Leu Asp Ala Ile Lys
195 200 205

Phe Asn Tyr Phe Pro Phe Gly Leu Asn Val Ala Ser Thr Ala Ser Thr
210 215 220

Leu Pro Lys Glu Ile Ala Tyr Ser Glu Pro Arg Leu Gln Pro Pro Asn
225 230 235 240

Gly Tyr Lys Asp Thr Thr Val Pro Gly Ile Trp Val Pro Asp Thr Pro
245 250 255

Leu Ser His Arg Asn Thr Gln Pro Gly Trp Val Val Ala Pro Gly Met
260 265 270

Gln Gly Val Gln Val Gly Pro Ile Thr Gln Gly Leu Leu Thr Pro Glu
275 280 285

Ser Leu Ala Glu Leu Met Gly Gly Pro Asp Ile Ala Pro Pro Ser Ser
290 295 300

Gly Leu Gln Thr Pro Pro Gly Pro Pro Asn Ala Tyr Asp Glu Tyr Pro
305 310 315 320

Val Leu Pro Pro Ile Gly Leu Gln Ala Pro Gln Val Pro Ile Pro Pro
325 330 335

Pro Pro Pro Gly Pro Asp Val Ile Pro Gly Pro Val Pro Pro Val Leu
340 345 350

Ala Ala Ile Val Phe Pro Arg Asp Arg Pro Ala Ala Ser Glu Asn Phe
355 360 365
Asp Tyr Met Gly Leu Leu Leu Leu Ser Pro Gly Leu Ala Thr Phe Leu
370 375 380

Phe Gly Val Ser Ser Ser Pro Ala Arg Gly Thr Met Ala Asp Arg His
385 390 395 400

Val Leu Ile Pro Ala Ile Thr Gly Leu Ala Leu Ile Ala Ala Phe Val
405 410 415

Ala His Ser Trp Tyr Arg Thr Glu His Pro Leu Ile Asp Met Arg Leu
420 425 430

Phe Gln Asn Arg Ala Val Ala Gln Ala Asn Met Thr Met Thr Val Leu
435 440 445

Ser Leu Gly Leu Phe Gly Ser Phe Leu Leu Leu Pro Ser Tyr Leu Gln
450 455 460

Gln Val Leu His Gln Ser Pro Met Gln Ser Gly Val His Ile Ile Pro
465 470 475 480

Gln Gly Leu Gly Ala Met Leu Ala Met Pro Ile Ala Gly Ala Met Met
485 490 495

Asp Arg Arg Gly Pro Ala Lys Ile Val Leu Val Gly Ile Met Leu Ile
500 505 510

Ala Ala Gly Leu Gly Thr Phe Ala Phe Gly Val Ala Arg Gln Ala Asp
515 520 525

Tyr Leu Pro Ile Leu Pro Thr Gly Leu Ala Ile Met Gly Met Gly Met
530 535 540

Gly Cys Ser Met Met Pro Leu Ser Gly Ala Ala Val Gln Thr Leu Ala
545 550 555 560

Pro His Gln Ile Ala Arg Gly Ser Thr Leu Ile Ser Val Asn Gln Gln
565 570 575

Val Gly Gly Ser Ile Gly Thr Ala Leu Met Ser Val Leu Leu Thr Tyr
580 585 590

Gln Phe Asn His Ser Glu Ile Ile Ala Thr Ala Lys Lys Val Ala Leu
595 600 605

Thr Pro Glu Ser Gly Ala Gly Arg Gly Ala Ala Val Asp Pro Ser Ser
610 615 620

Leu Pro Arg Gln Thr Asn Phe Ala Ala Gln Leu Leu His Asp Leu Ser
625 630 635 640

His Ala Tyr Ala Val Val Phe Val Ile Ala Thr Ala Leu Val Val Ser
645 650 655

Thr Leu Ile Pro Ala Ala Phe Leu Pro Lys Gln Gln Ala Ser His Arg
660 665 670

Arg Ala Pro Leu Leu Ser Ala
675

(2) INFORMATION FOR SEQ ID NO: 190: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 120 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 190: 

Thr Pro Glu Lys Ser Phe Val Asp Asp Leu Asp Ile Asp Ser Leu Ser
1 5 10 15

Met Val Glu Ile Ala Val Gln Thr Glu Asp Lys Tyr Gly Val Lys Ile
20 25 30

Pro Asp Glu Asp Leu Ala Gly Leu Arg Thr Val Gly Asp Val Val Ala
35 40 45

Tyr Ile Gln Lys Leu Glu Glu Glu Asn Pro Glu Ala Ala Gln Ala Leu
50 55 60

Arg Ala Lys Ile Glu Ser Glu Asn Pro Asp Ala Ala Arg Ala Asp Arg
65 70 75 80

Cys Val Ser Pro Thr Ser Gln Ala Arg Asp Ala Arg Arg Pro Leu Ala
85 90 95

Arg Ser Ala Arg Leu Ala Cys Arg Arg Leu Pro Ala Ser Val Pro Thr
100 105 110

Thr Arg Arg Asp Pro Arg Glu Arg
115 120

(2) INFORMATION FOR SEQ ID NO: 191: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 89 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 191: 

Leu Ala Cys Gln Cys His Arg Arg Tyr Asp Val Gly Ile Gln Phe Arg
1 5 10 15

Gly Pro Ala Gly Pro Val Ala Thr Gln Ser Gly Pro Pro Gly Pro Ser
20 25 30

Ile Ala Glu Gly Arg Gln Val Arg Ala Gln Cys Gly Ala Gly Phe Leu
35 40 45

Glu Arg Arg Pro Ala Val Ser Gly Ala Leu Pro Pro Asn Asn Ala Ser
50 55 60

Pro Gly Ile Arg Ser Arg Ala Ala Asp Ser Gln Arg His Leu Leu Ala
65 70 75 80

Gly Asp Gly Ser Asp Val Thr Val Gly
85



(2) INFORMATION FOR SEQ ID NO: 192: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 119 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 192: 

Ala Ser Leu Leu Ala Tyr Ser Ala Ala Ala Ala Ser Thr Ala Leu Ala
1 5 10 15

Val Ala Cys Val Arg Ala Asp His Arg Asp Arg Arg Thr Ile Arg Asp
20 25 30

His Leu Ala Met Ile His Leu Ala Gln Leu Val Thr Gln Pro Pro Gly
35 40 45

Gly Val Arg Gln Arg Leu His His Leu Gly Ile Ala Val Ala Pro Gln
50 55 60

Pro Gln Glu Val Val Val Leu Ala His His Leu Val Thr Gly Thr Gly
65 70 75 80

Glu Val Gln Gly Glu Gly Arg His Val Ala Ala Glu Val Val Asp Pro
85 90 95

Glu Asn Gln Ile Leu Arg Gln Val Leu Gly Pro Ala Pro His Asp Lys
100 105 110

Pro Asp Ala Gly Ile Gly Gln
115



(2) INFORMATION FOR SEQ ID NO: 193: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 116 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 193:

Arg Ala Arg Gly His Arg Ser Ser Lys Gly Ser Arg Trp Ser His Glu
1 5 10 15

Val Leu Glu Gly Cys Ile Leu Ala Asp Ser Arg Gln Ser Lys Thr Ala
20 25 30

Ala Ser Pro Ser Pro Ser Arg Pro Gln Ser Ser Ser Asn Asn Ser Val
35 40 45

Pro Gly Ala Pro Asn Arg Val Ser Phe Ala Lys Leu Arg Glu Pro Leu
50 55 60

Glu Val Pro Gly Leu Leu Asp Val Gln Thr Asp Ser Phe Glu Trp Leu
65 70 75 80

Ile Gly Ser Pro Arg Trp Arg Glu Ser Ala Ala Glu Arg Gly Asp Val
85 90 95

Asn Pro Val Gly Gly Leu Glu Glu Val Leu Tyr Glu Leu Ser Pro Ile
100 105 110

Glu Asp Phe Ser
115

(2) INFORMATION FOR SEQ ID NO: 194: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 811 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 194:

TGCTACGCAG CAATCGCTTT GGTGACAGAT GTGGATGCCG GCGTCGCTGC TGGCGATGGC 60 

GTGAAAGCCG CCGACGTGTT CGCCGCATTC GGGGAGAACA TCGAACTGCT CAAAAGGCTG 120

GTGCGGGCCG CCATCGATCG GGTCGCCGAC GAGCGCACGT GCACGCACTG TCAACACCAC 180

GCCGGTGTTC CGTTGCCGTT CGAGCTGCCA TGAGGGTGCT GCTGACCGGC GCGGCCGGCT 240

TCATCGGGTC GCGCGTGGAT GCGGCGTTAC GGGCTGCGGG TCACGACGTG GTGGGCGTCG 300 

ACGCGCTGCT GCCCGCCGCG CACGGGCCAA ACCCGGTGCT GCCACCGGGC TGCCAGCGGG 360 

TCGACGTGCG CGACGCCAGC GCGCTGGCCC CGTTGTTGGC CGGTGTCGAT CTGGTGTGTC 420 

ACCAGGCCGC OVTGGTGGGT GCCGGCGTCA ACGCCGCCGA CGCACCCGCC TATGGCGGCC 480

ACAACGATTT CGCCACCACG GTGCTGCTGG CGCAGATGTT CGCCGCCGGG GTCCGCCGTT 540

TGGTGCTGGC GTCGTCGATG GTGGTTTACG GGCAGGGGCG CTATGACTGT CCCCAGCATG 600

GACCGGTCGA CCCGCTGCCG CGGCGGCGAG CCGACCTGGA CAATGGGGTC TTCGAGCACC 660

GTTGCCCGGG GTGCGGCGAG CCAGTCATCT GGCAAITGGT CGACGAAGAT GCCCCGTTGC 720

GCCCGCGCAG CCTGTACGCG GCAGCAAGAC CGCGCAGGAG CACTACGCGC TGGCGTGGTC 780

GGAAACGAAT GGCGGTTCCG TGGTGGCGTT G 811



(2) INFORMATION FOR SEQ ID NO: 195: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 966 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 195:

GTCCCGCGAT GTGGCCGAGC ATGACTTTCG GCAACACCGG CGTAGTAGTC GAAGATATCG 60

GACTTTGTGG TCCCGGTGGC GGGATAGAGC ACCTGTCGGC GTTGGTCAGC GTCACCCGTT 120

GCTCGGACGC CGAACCCATG CTTTCAACGT AGCCTGTCGG TCACACAAGT CGCGAGCGTA 180 

ACGTCACGGT CAAATATCGC GTGGAATTTC GCCGTGACGT TCCGCTCGCG GACAATCAAG 240 

GCATACTCAC TTACATGCGA GCCATTTGGA CGGGTTCGAT CGCCTTCGGG CTGGTGAACG 300 

TGCCGGTCAA GGTGTACAGC GCTACCGCAG ACCACGACAT CAGGTTCCAC CAGGTGCACG 360 

CCAAGGACAA CGGACGCATC CGGTACAAGC GCGTCTGCGA GGCGTGTGGC GAGGTGGTCG 420

ACTACCGCGA TCTTGCCCGG GCCTACGAGT CCGGCGACGG CCAAATGGTG GCGATCACCG 480 

ACGACGACAT CGCCAGCTTG CCTGAAGAAC GCAGCCGGGA GATCGAGGTG TTGGAGTTCG 540

TCCCCGCCGC CGACGTGGAC CCGATGATGT TCGACCGCAG CTACTTTTTG GAGCCTGATT 600

CGAAGTCGTC GAAATCGTAT GTGCTGCTGG CTAAGACACT CGCCGAGACC GACCGGATGG 660

CGATCGTGGA TCGCCCCACC GGCCGTGAAT GCAGGAAAAA TAAGAGCCGC TATCCACAAT 720

TCGGCGTCGA GCTCGGCTAC CACAAACGGT AGAACGATCG AGACATTCCC GAGCTGAAGT 780

GCGGCGCTAT AGAAGCCGCT CTGCGCGATT ATCAAACGCA AAATACGCTT ACTCATGCCA 840

TCGGCGCTGC TCACCCGATG CGACGTTTrT GCCACGCTCC ACCGCCTGCC GCGCGACCTC 900

AAGTGGGCAT GCATCCCACC CGTTCCCGGA AACCGGTTCC GGCGGGTCGG CTCATCGCTT 960 

CATCCT 966

(2) INFORMATION FOR SEQ ID NO: 196:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2367 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 196: 

CCGCACCGCC GGCAATACCG CCAGCGCCAC CGTTACCGCC GTTTGCGCCG TTGCCCCCGT 60 

TGCCGCCCGT CCCGCCGGCC CCGCCGATGG AGTTCTCATC GCCAAAAGTA CTGGCGTTGC 120

CACCGGAGCC GCCGTTGCCG CCGTCACCGC CAGCCCCGCC GACTCCACCG GCCCCACCGA 180 

CTCCGCCGCT GCCACCGTTG CCGCCGTTGC CGATCAACAT GCCGCTGGCG CCACCCTTGC 240 

ZACCCACGCZ ACCGGCTCCG CCCACCCCGC CGACACCAAG CGAGCTGCCG CCGGAGCCAC 300

CATCACCACC TACGCCACCG ACCGCCCAGA CACCAGCGAC CGGGTCTTCG TGAAAC3TC3 360

CGGTGCCACC ACCGCCGCCG TTACCGCCAA CCCCACCGGC AACGCCGGCG CCGCCATCCC 420

CQCCGQCCCC GGCGTTGCCG CCGTTGCCGC CGTTGCCGAA CAACAACCCG CCGGCGCCGC 480

CGTTGCCGCC CGCGCCGCCG GTCCCGCCGG CGCCGCCGAC GCCAAGGCCG CTGCCGCCCT 540

TGCCGCCATC ACCACCCTTG CCGCCGACCA CATCGGGTTC TGCCTCGGGG TCTGGGCTGT 600 

CAAACCTCGC GATGCCAGCG TTGCCGCCGC TTCCCCCGGG CCCCCCCGTG GCGCCGTCAC 660 

CACCGATACC ACCCGCGCCA CCGGCGCCAC CGTTGCCGCC ATCACCGAAT AGCAACCCGC 720

CGGCGCCACC ATTGCCGCCA GCTCCCCCTG CGCCACCGTC GGCGCCGGAG GCGGCACTGG 780

CAGCCCCGTT ACCACCGAAA CCGCCGCTAC CACCGGTAGA GGTGGCAGTG GCGATGTGTA 840

CGAAAGCGCC GCCTCCGGCG CCGCCGCTAC CACCCCCACT GCCGGCGGCT ACACCGTCGG 900 

ACCCGTTGCC ACCATCACCG CCAAAGGCGC TCGCAATGTC GCCCTGCGCG ACTCCGCCGT 960 

CGCCGCCGTT GCCGCCGCCG CCACCGGCAG CGGCGGTACC GCCGTCACCA CCGGCACCGC 1020

CGGTGGCCTT GCCCGAGCCT GCCGTCGCGG TGGCACCGTC GCCGCCGGTG CCACCGGTCG 1080

GCGTGCCGGC AGTGCCATGG CCGCCCGTGC CGCCGTCGCC GCCGGTTTGA TCACCGATGC 1140

CGGACACATC TGCCGGGCTG TCCCCGGTGC TGGCCGCGGG GCCGGGCGTG GGATTGACCC 1200

CGTTTGCCCC GGCGAGGCCG GCGCCGCCGG TACCACCGGC GCCGCCATGG CCGAACAGCC 1260

CGGCGTTGCC GCCGTTACCG CCCGCACCCC CGATGCCTGC GGCCACGCTG GTGCCGCCGA 1320

CACCGCCGTT GCCGCCGTTG CCCCACAACC ACCCCCCGTT CCCACCGGCA CCGCCGGCCG 1380

CGCCGGTACC ACCGGCCCCG CCGTTGCCGC CGTTGCCGAT CAACCCGGCC GCGCCTCCGC 1440

TGCCGCCGGT TTGACCGAAC CCGCCAGCCG CGCCGTTGCC ACCGTTGCCA AACAGCAACC 1500 

CGCCGGCCGC GCCAGGCTGC CCGGGTGCCG TCCCGTCGGC GCCGTTTCCG ATCAACGGGC 1560 

GCCCCAAAAG CGCCTCGGTG GGCGCATTCA CCGCACCCAG CAGACTCCGC TCAACAGCGG 1620

CTTCAGTGCT GGCATACCGA CCCGCGGCCG CAGTCAACGC CTGCACAAAC TGCTCGTGAA 1680

ACGCTGCCAC CTGTACGCTG AGCGCCTGAT ACTGCCGAGC ATGGGCCCCG AACAACCCCG 1740

CAATCGCCGC CGACACTTCA TCGGCAGCCG CAGCCACCAC TTCCGTCGTC GGGATCGCCG 1800

CGGCCGCATT AGCCGCGCTC ACCTGCGAAC CAATAGTCGA TAAATCCAAA GCCGCAGTTG 1860

CCAGCAGCTG CGGCGTCGCG ATCACCAAGG ACACCTCGCA CCTCCGGATA CCCCATATCG 1920 

CCGCACCGTG TCCCCAGCGG CCACGTGACC TTTGGTCGCT GGCTGGCGGC CCTGACTATG 1980 

GCCGCGACGG CCCTCGTTCT GATTCGCCCC GGCGCGCAGC TTGTTGCGCG AGTTGAAGAC 2040

GGGAGGACAG GCCGAGCTTG GTGTAGACGT GGGTCAAGTG GGAATGCACG GTCCGCGGCG 2100 

AGATGAATAG GCGGACGCCG ATCTCCTTGT TGCTGAGTCC CTCACCGACC AGTAGAGCCA 2160 

CCTCAAGCTC TGTCGGTGTC AACGCGCCCC AGCCACTTGT CGGGCGTTTC CGTGCACCGC 2220

GGCCTCGTTG CGCGTACGCG A7CGCCTCAT CGATCGATAA CGCAGTTCCT TCGGCCCAGG 2280 

CATCGTCGAA CTCGCTGTCA CCGATGGATT TTCGAAGGGT GGCTAGCGAC GAGTTACAGC 2340 

CCGCCTGGTA GATCCCGAAG CGGACCG 2367

(2) INFORMATION FOR SEQ ID NO: 197:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 376 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 197: 

Gln Pro Ala Gly Ala Thr Ile Ala Ala Ser Ser Pro Cys Ala Thr Val
1 5 10 15

Gly Ala Gly Gly Gly Thr Gly Ser Pro Val Thr Thr Glu Thr Ala Ala
20 25 30

Thr Thr Gly Arg Gly Gly Ser Gly Asp Val Tyr Glu Ser Ala Ala Ser
35 40 45

Gly Ala Ala Ala Thr Thr Pro Thr Ala Gly Gly Tyr Thr Val Gly Pro
50 55 60

Val Ala Thr Ile Thr Ala Lys Gly Ala Arg Asn Val Ala Leu Arg Asp
65 70 75 80

Ser Ala Val Ala Ala Val Ala Ala Ala Ala Thr Gly Ser Gly Gly Thr
85 90 95

Ala Val Thr Thr Gly Thr Ala Gly Gly Leu Ala Arg Ala Cys Arg Arg
100 105 110

Gly Gly Thr Val Ala Ala Gly Ala Thr Gly Arg Arg Ala Gly Ser Ala
115 120 125

Met Ala Ala Arg Ala Ala Val Ala Ala Gly Leu Ile Thr Asp Ala Gly
130 135 140

His Ile Cys Arg Ala Val Pro Gly Ala Gly Arg Gly Ala Gly Arg Gly
145 150 155 160

Ile Asp Pro Val Cys Pro Gly Glu Ala Gly Ala Ala Gly Thr Thr Gly
165 170 175

Ala Ala Met Ala Glu Gln Pro Gly Val Ala Ala Val Thr Ala Arg Thr
180 185 190

Pro Asp Ala Cys Gly His Ala Gly Ala Ala Asp Thr Ala Val Ala Ala
195 200 205

Val Ala Pro Gln Pro Pro Pro Val Pro Thr Gly Thr Ala Gly Arg Ala
210 215 220

Gly Thr Thr Gly Pro Ala Val Ala Ala Val Ala Asp Gln Pro Gly Arg
225 230 235 240

Ala Ser Ala Ala Ala Gly Leu Thr Glu Pro Ala Ser Arg Ala Val Ala
245 250 255

Thr Val Ala Lys Gln Gln Pro Ala Gly Arg Ala Arg Leu Pro Gly Cys
260 265 270

Arg Pro Val Gly Ala Val Ser Asp Gln Arg Ala Pro Gln Lys Arg Leu
275 280 285

Gly Gly Arg Ile His Arg Thr Gln Gln Thr Pro Leu Asn Ser Gly Phe
290 295 300

Ser Ala Gly Ile Pro Thr Arg Gly Arg Ser Gln Arg Leu His Lys Leu
305 310 315 320

Leu Val Lys Arg Cys His Leu Tyr Ala Glu Arg Leu Ile Leu Pro Ser
325 330 335

Met Gly Pro Glu Gln Pro Arg Asn Arg Arg Arg His Phe Ile Gly Ser
340 345 350

Arg Ser His His Phe Arg Arg Arg Asp Arg Arg Gly Arg Ile Ser Arg
355 360 365

Ala His Leu Arg Thr Asn Ser Arg
370 375

(2) INFORMATION FOR SEQ ID NO: 198:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2852 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 198: 

GGCCAAAACG CCCCGGCGAT CGCGGCCACC GAGGCCGCCT ACGACCAGAT GTGGGCCCAG 60 

GACGTGGCGG CGATGTTTGG CTACCATGCC GGGGCTTCGG CGGCCGTCTC GGCGTTGACA 120 

CCGTTCGGCC AGGCGCTGCC GACCGTGGCG GGCGGCGGTG CGCTGGTCAG CGCGGCCGCG 180 

GCTCAGGTGA CCACGCGGGT CTTCCGCAAC CTGGGCTTGG CGAACGTCCG CGAGGGCAAC 240

GTCCGCAACG GTAATGTCCG GAACTTCAAT CTCGGCTCGG CCAACATCGG CAACGGCAAC 300

ATCGGCAGCG GCAACATCGG CAGCTCCAAC ATCGGGTTTG GCAACGTGGG TCCTGGGTTG 360

ACCGCAGCGC TGAACAACAT CGGTTTCGGC AACACCGGCA GCAACAACAT CGGGTTTGGC 420

AACACCGGCA GCAACAACAT CGGGTTCGGC AATACCGGAG ACGGCAACCG AGGTATCGGG 480

CTCACGGGTA GCGGTTTGTT GGGGTTCGGC GGCCTGAACT CGGGCACCGG CAACATCGGT 540

CTGTTCAACT CGGGCACCGG AAACGTCGGC ATCGGCAACT CGGGTACCGG GAACTGGGGC 600

ATTGGCAACT CGGGCAACAG CTACAACACC GGTTTTGGCA ACTCCGGCGA CGCCAACACG 660

GGCTTCTTCA ACTCCGGAAT AGCCAACACC GGCGTCGGCA ACGCCGGCAA CTACAACACC 720

GGTAGCTACA ACCCGGGCAA CAGCAATACC GGCGGCTTCA ACATGGGCCA GTACAACACG 780

GGCTACCTGA ACAGCGGCAA CTACAACACC GGCTTGGCAA ACTCCGGCAA TGTCAACACC 840

GGCGCCTTCA TTACTGGCAA CTTCAACAAC GGCTTCTTGT GGCGCGGCGA CCACCAAGGC 900

CTGATTTTCG GGAGCCCCGG CTTCTTCAAC TCGACCAGTG CGCCGTCGTC GGGATTCTTC 960

AACAGCGGTG CCGGTAGCGC GTCCGGCTTC CTGAACTCCG GTGCCAACAA TTCTGGCTTC 1020

TTCAACTCTT CGTCGGGGGC CATCGGTAAC TCCGGCCTGG CAAACGCGGG CGTGCTGGTA 1080

TCGGGCGTGA TCAACTCGGG CAACACCGTA TCGGGTTTGT TCAACATGAG CCTGGTGGCC 1140

ATCACAACGC CGGCCTTGAT CTCGGGCTTC TTCAACACCG GAAGCAACAT GTCGGGATTT 1200






TTCGGTGGCC CACCGGTCTT CAATCTCGGC CTGGCAAACC GGGGCGTCGT GAACATTCTC 1260 

GGCAACGCCA ACATCGGCAA TTACAACATT CTCGGCAGCG GAAACGTCGG TGACTTCAAC 1320 

ATCCTTGGCA GCGGCAACCT CGGCAGCCAA AACATCTTGG GCAGCGGCAA CGTCGGCAGC 1380

TTCAATATCG GCAGTGGAAA CATCGGAGTA TTCAATGTCG GTTCCGGAAG CCTGGGAAAC 1440

TACAACATCG GATCCGGAAA CCTCGGGATC TACAACATCG GTTTTGGAAA CGTCGGCGAC 1500

TACAACGTCG GCTTCGGGAA CGCGGGCGAC TTCAACCAAG GCTTTGCCAA CACCGGCAAC 1560

AACAACATCG GGTTCGCCAA CACCGGCAAC AACAACATCG GCATCGGGCT GTCCGGCGAC 1620

AACCAGCAGG GCTTCAATAT TGCTAGCGGC TGGAACTCGG GCAGCGGCAA CAGCGGCCTG 1680

TTCAATTCGG GCACCAATAA CGTTGGCATC TTCAACGCGG GCACCGGAAA CGTCGGCATC 1740

GCAAACTCGG GCACCGGGAA CTGGGGTATC GGGAACCCGG GTACCGACAA TACCGGCATC 1800

CTCAATGCTG GCAGCTACAA CACGGGCATC CTCAACGCCG GCGACTTCAA CACGGGCTTC 1860

TACAACACGG GCAGCTACAA CACCGGCGGC TTCAACGTCG GTAACACCAA CACCGGCAAC 1920

TTCAACGTGG GTGACACCAA TACCGGCAGC TATAACCCGG GTGACACCAA CACCGGCTTC 1980

TTCAATCCCG GCAACGTCAA TACCGGCGCT TTCGACACGG GCGACTTCAA CAATGGCTTC 2040

TTGGTGGCGG GCGATAACCA GGGCCAGATT GCCATCGATC TCTCGGTCAC CACTCCATTC 2100

ATCCCCATAA ACGAGCAGAT GGTCATTGAC GTACACAACG TAATGACCTT CGGCGGCAAC 2160

ATGATCACGG TCACCGAGGC CTCGACCGTT TTCCCCCAAA CCTTCTATCT GAGCGGTTTG 2220

TTCTTCTTCG GCCCGGTCAA TCTCAGCGCA TCCACGCTGA CCGTTCCGAC GATCACCCTC 2280

ACCATCGGCG GACCGACGGT GACCGTCCCC ATCAGCATTG TCGGTGCTCT GGAGAGCCGC 2340

ACGATTACCT TCCTCAAGAT CGATCCGGCG CCGGGCATCG GAAATTCGAC CACCAACCCC 2400

TCGTCCGGCT TCTTCAACTC GGGCACCGGT GGCACATCTG GCTTCCAAAA CGTCGGCGGC 2460

GGCAGTTCAG GCGTCTGGAA CAGTGGTTTG AGCAGCGCGA TAGGGAATTC GGGTTTCCAG 2520

AACCTCGGCT CGCTGCAGTC AGGCTGGGCG AACCTGGGCA ACTCCGTATC GGGCTTTTTC 2580

AACACCAGTA CGGTGAACCT CTCCACGCCG GCCAATGTCT CGGGCCTGAA CAACATCGGC 2640

ACCAACCTGT CCGGCGTGTT CCGCGGTCCG ACCGGGACGA TTTTCAACGC GGGCCTTGCC 2700

AACCTGGGCC AGTTGAACAT CGGCAGCGCC TCGTGCCGAA TTCGGCACGA GTTAGATACG 2760

GTTTCAACAA TCATATCCGC GTTTTGCGGC AGTGCATCAG ACGAATCGAA CCCGGGAAGC 2820

GTAAGCGAAT AAACCGAATG GCGGCCTGTC AT 2852

(2) INFORMATION FOR SEQ ID NO: 199: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 943 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 199: 

Gly Gln Asn Ala Pro Ala Ile Ala Ala Thr Glu Ala Ala Tyr Asp Gln
1 5 10 15

Met Trp Ala Gln Asp Val Ala Ala Met Phe Gly Tyr His Ala Gly Ala
20 25 30

Ser Ala Ala Val Ser Ala Leu Thr Pro Phe Gly Gln Ala Leu Pro Thr
35 40 45

Val Ala Gly Gly Gly Ala Leu Val Ser Ala Ala Ala Ala Gln Val Thr
50 55 60

Thr Arg Val Phe Arg Asn Leu Gly Leu Ala Asn Val Arg Glu Gly Asn
65 70 75 80

Val Arg Asn Gly Asn Val Arg Asn Phe Asn Leu Gly Ser Ala Asn Ile
85 90 95

Gly Asn Gly Asn Ile Gly Ser Gly Asn Ile Gly Ser Ser Asn Ile Gly
100 105 110

Phe Gly Asn Val Gly Pro Gly Leu Thr Ala Ala Leu Asn Asn Ile Gly
115 120 125

Phe Gly Asn Thr Gly Ser Asn Asn Ile Gly Phe Gly Asn Thr Gly Ser
130 135 140

Asn Asn Ile Gly Phe Gly Asn Thr Gly Asp Gly Asn Arg Gly Ile Gly
145 150 155 160

Leu Thr Gly Ser Gly Leu Leu Gly Phe Gly Gly Leu Asn Ser Gly Thr
165 170 175

Gly Asn Ile Gly Leu Phe Asn Ser Gly Thr Gly Asn Val Gly Ile Gly
180 185 190

Asn Ser Gly Thr Gly Asn Trp Gly Ile Gly Asn Ser Gly Asn Ser Tyr
195 200 205

Asn Thr Gly Phe Gly Asn Ser Gly Asp Ala Asn Thr Gly Phe Phe Asn
210 215 220



Ser Gly Ile Ala Asn Thr Gly Val Gly Asn Ala Gly Asn Tyr Asn Thr
225 230 235 240

Gly Ser Tyr Asn Pro Gly Asn Ser Asn Thr Gly Gly Phe Asn Met Gly
245 250 255

Gln Tyr Asn Thr Gly Tyr Leu Asn Ser Gly Asn Tyr Asn Thr Gly Leu
260 265 270

Ala Asn Ser Gly Asn Val Asn Thr Gly Ala Phe Ile Thr Gly Asn Phe
275 280 285

Asn Asn Gly Phe Leu Trp Arg Gly Asp His Gln Gly Leu Ile Phe Gly
290 295 300

Ser Pro Gly Phe Phe Asn Ser Thr Ser Ala Pro Ser Ser Gly Phe Phe
305 310 315 320

Asn Ser Gly Ala Gly Ser Ala Ser Gly Phe Leu Asn Ser Gly Ala Asn
325 330 335

Asn Ser Gly Phe Phe Asn Ser Ser Ser Gly Ala Ile Gly Asn Ser Gly
340 345 350

Leu Ala Asn Ala Gly Val Leu Val Ser Gly Val Ile Asn Ser Gly Asn
355 360 365

Thr Val Ser Gly Leu Phe Asn Met Ser Leu Val Ala Ile Thr Thr Pro
370 375 380

Ala Leu Ile Ser Gly Phe Phe Asn Thr Gly Ser Asn Met Ser Gly Phe
385 390 395 400

Phe Gly Gly Pro Pro Val Phe Asn Leu Gly Leu Ala Asn Arg Gly Val
405 410 415

Val Asn Ile Leu Gly Asn Ala Asn Ile Gly Asn Tyr Asn Ile Leu Gly
420 425 430

Ser Gly Asn Val Gly Asp Phe Asn Ile Leu Gly Ser Gly Asn Leu Gly
435 440 445

Ser Gln Asn Ile Leu Gly Ser Gly Asn Val Gly Ser Phe Asn Ile Gly
450 455 460

Ser Gly Asn Ile Gly Val Phe Asn Val Gly Ser Gly Ser Leu Gly Asn
465 470 475 480

Tyr Asn Ile Gly Ser Gly Asn Leu Gly Ile Tyr Asn Ile Gly Phe Gly
485 490 495

Asn Val Gly Asp Tyr Asn Val Gly Phe Gly Asn Ala Gly Asp Phe Asn
500 505 510

Gln Gly Phe Ala Asn Thr Gly Asn Asn Asn Ile Gly Phe Ala Asn Thr
515 520 525

Gly Asn Asn Asn Ile Gly Ile Gly Leu Ser Gly Asp Asn Gln Gln Gly
530 535 540

Phe Asn Ile Ala Ser Gly Trp Asn Ser Gly Thr Gly Asn Ser Gly Leu
545 550 555 560

Phe Asn Ser Gly Thr Asn Asn Val Gly Ile Phe Asn Ala Gly Thr Gly
565 570 575

Asn Val Gly Ile Ala Asn Ser Gly Thr Gly Asn Trp Gly Ile Gly Asn
580 585 590

Pro Gly Thr Asp Asn Thr Gly Ile Leu Asn Ala Gly Ser Tyr Asn Thr
595 600 605

Gly Ile Leu Asn Ala Gly Asp Phe Asn Thr Gly Phe Tyr Asn Thr Gly
610 615 620

Ser Tyr Asn Thr Gly Gly Phe Asn Val Gly Asn Thr Asn Thr Gly Asn
625 630 635 640

Phe Asn Val Gly Asp Thr Asn Thr Gly Ser Tyr Asn Pro Gly Asp Thr
645 650 655

Asn Thr Gly Phe Phe Asn Pro Gly Asn Val Asn Thr Gly Ala Phe Asp
660 665 670

Thr Gly Asp Phe Asn Asn Gly Phe Leu Val Ala Gly Asp Asn Gln Gly
675 680 685

Gln Ile Ala Ile Asp Leu Ser Val Thr Thr Pro Phe Ile Pro Ile Asn
690 695 700

Glu Gln Met Val Ile Asp Val His Asn Val Met Thr Phe Gly Gly Asn
705 710 715 720

Met Ile Thr Val Thr Glu Ala Ser Thr Val Phe Pro Gln Thr Phe Tyr
725 730 735

Leu Ser Gly Leu Phe Phe Phe Gly Pro Val Asn Leu Ser Ala Ser Thr
740 745 750

Leu Thr Val Pro Thr Ile Thr Leu Thr Ile Gly Gly Pro Thr Val Thr
755 760 765

Val Pro Ile Ser Ile Val Gly Ala Leu Glu Ser Arg Thr Ile Thr Phe
770 775 780

Leu Lys Ile Asp Pro Ala Pro Gly Ile Gly Asn Ser Thr Thr Asn Pro
785 790 795 800



Ser Ser Gly Phe Phe Asn Ser Gly Thr Gly Gly Thr Ser Gly Phe Gln
805 810 815

Asn Val Gly Gly Gly Ser Ser Gly Val Trp Asn Ser Gly Leu Ser Ser
820 825 830

Ala Ile Gly Asn Ser Gly Phe Gln Asn Leu Gly Ser Leu Gln Ser Gly
835 840 845

Trp Ala Asn Leu Gly Asn Ser Val Ser Gly Phe Phe Asn Thr Ser Thr
850 855 860

Val Asn Leu Ser Thr Pro Ala Asn Val Ser Gly Leu Asn Asn Ile Gly
865 870 875 880

Thr Asn Leu Ser Gly Val Phe Arg Gly Pro Thr Gly Thr Ile Phe Asn
885 890 895

Ala Gly Leu Ala Asn Leu Gly Gln Leu Asn Ile Gly Ser Ala Ser Cys
900 905 910

Arg Ile Arg His Glu Leu Asp Thr Val Ser Thr Ile Ile Ser Ala Phe
915 920 925

Cys Gly Ser Ala Ser Asp Glu Ser Asn Pro Gly Ser Val Ser Glu
930 935 940

(2) INFORMATION FOR SEQ ID NO: 200:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 53 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 200:

GGATCCATAT GGGCCATCAT CATCATCATG ACGTGATCGA CATCATCGGG ACC 

(2) INFORMATION FOR SEQ ID NO: 201:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 42 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 201:
CCTGAATTCA GGCCTCGGTT GCGCCGGCCT CATCTTGAAC GA 
(2) INFORMATION FOR SEQ ID NO: 202: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 202:
GGATCCTGCA GGCTCGAAAC CACCGAGCGG T 
(2) INFORMATION FOR SEQ ID NO:203: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 203:
CTCTGAATTC AGCGCTGGAA ATCGTCGCGA T 
(2) INFORMATION FOR SEQ ID NO: 204:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 204:
GGATCCAGCG CTGAGATGAA GACCGATGCC GCT 
(2) INFORMATION FOR SEQ ID NO: 205:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 38 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 205:
GGATATCTGC AGAATTCAGG TTTAAAGCCC ATTTGCGA 
(2) INFORMATION FOR SEQ ID NO: 206: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 206:
CCGCATGCGA GCCACGTGCC CACAACGGCC 
(2) INFORMATION FOR SEQ ID NO: 207:






(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 37 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 207:
CTTCATGGAA TTCTCAGGCC GGTAAGGTCC GCTGCGG 37 
(2) INFORMATION FOR SEQ ID NO: 208: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7676 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:208: 

TGGCGAATGG GACGCGCCCT GTAGCGGCGC ATTAAGCGCG GCGGGTGTGG TGGTTACGCG 60 

CAGCGTGACC GCTACACTTG CCAGCGCCCT AGCGCCCGCT CCTTTCGCTT TCTTCCCTTC 120 

CTTTCTCGCC ACGTTCGCCG GCTTTCCCCG TCAAGCTCTA AATCGGGGGC TCCCTTTAGG 180 

GTTCCGATTT AGTGCTTTAC GGCACCTCGA CCCCAAAAAA CTTGATTAGG GTGATGGTTC 240

ACGTAGTGGG CCATCGCCCT GATAGACGGT TTTTCGCCCT TTGACGTTGG AGTCCACGTT 300

CTTTAATAGT GGACTCTTGT TCCAAACTGG AACAACACTC AACCCTATCT CGGTCTATTC 360

TTTTGATTTA TAAGGGATTT TGCCGATTTC GGCCTATTGG TTAAAAAATG AGCTGATTTA 420

ACAAAAATTT AACGCGAATT TTAACAAAAT ATTAACGTTT ACAATTTCAG GTGGCACTTT 480

TCGGGGAAAT GTGCGCGGAA CCCCTATTTG TTTATTTTTC TAAATACATT CAAATATGTA 540

TCCGCTCATG AATTAATTCT TAGAAAAACT CATCGAGCAT CAAATGAAAC TGCAATTTAT 600

TCATATCAGG ATTATCAATA CCATATTTTT GAAAAAGCCG TTTCTGTAAT GAAGGAGAAA 660

ACTCACCGAG GCAGTTCCAT AGGATGGCAA GATCCTGGTA TCGGTCTGCG ATTCCGACTC 720

GTCCAACATC AATACAACCT ATTAATTTCC CCTCGTCAAA AATAAGGTTA TCAAGTGAGA 780

AATCACCATG AGTGACGACT GAATCCGGTG AGAATGGCAA AAGTTTATGC ATTTCTTTCC 840

AGACTTGTTC AACAGGCCAG CCATTACGCT CGTCATCAAA ATCACTCGCA TCAACCAAAC 900

CGTTATTCAT TCGTGATTGC GCCTGAGCGA GACGAAATAC GCGATCGCTG TTAAAAGGAC 960

AATTACAAAC AGGAATCGAA TGCAACCGGC GCAGGAACAC TGCCAGCGCA TCAACAATAT 1020

TTTCACCTGA ATCAGGATAT TCTTCTAATA CCTGGAATGC TGTTTTCCCG GGGATCGCAG 1080

TGGTGAGTAA CCATGCATCA TCAGGAGTAC GGATAAAATG CTTGATGGTC GGAAGAGGCA 1140

TAAATTCCGT CAGCCAGTTT AGTCTGACCA TCTCATCTGT AACATCATTG GCAACGCTAC 1200 

CTTTGCCATG TTTCAGAAAC AACTCTGGCG CATCGGGCTT CCCATACAAT CGATAGATTG 1260 

TCGCACCTGA TTGCCCGACA TTATCGCGAG CCCATTTATA CCCATATAAA TCAGCATCCA 1320 

TGTTGGAATT TAATCGCGGC CTAGAGCAAG ACGTTTCCCG TTGAATATGG CTCATAACAC 1380 

CCCTTGTATT ACTGTTTATG TAAGCAGACA GTTTTATTGT TCATGACCAA AATCCCTTAA 1440

CGTGAGTTTT CGTTCCACTG AGCGTCAGAC CCCGTAGAAA AGATCAAAGG ATCTTCTTGA 1500

GATCCTTTTT TTCTGCGCGT AATCTGCTGC TTGCAAACAA AAAAACCACC GCTACCAGCG 1560

GTGGTTTGTT TGCCGGATCA AGAGCTACCA ACTCTTTTTC CGAAGGTAAC TGGCTTCAGC 1620

AGAGCGCAGA TACCAAATAC TGTCCTTCTA GTGTAGCCGT AGTTAGGCCA CCACTTCAAG 1680

AACTCTGTAG CACCGCCTAC ATACCTCGCT CTGCTAATCC TGTTACCAGT GGCTGCTGCC 1740 

AGTGGCGATA AGTCGTGTCT TACCGGGTTG GACTCAAGAC GATAGTTACC GGATAAGGCG 1800 

CAGCGGTCGG GCTGAACGGG GGGTTCGTGC ACACAGCCCA GCTTGGAGCG AACGACCTAC 1860 

ACCGAACTGA GATACCTACA GCGTGAGCTA TGAGAAAGCG CCACGCTTCC CGAAGGGAGA 1920 

AAGGCGGACA GGTATCCGGT AAGCGGCAGG GTCGGAACAG GAGAGCGCAC GAGGGAGCTT 1980 

CCAGGGGGAA ACGCCTGGTA TCTTTATAGT CCTGTCGGGT TTCGCCACCT CTGACTTGAG 2040 

CGTCGATTTT TGTGATGCTC GTCAGGGGGG CGGAGCCTAT GGAAAAACGC CAGCAACGCG 2100

GCCTTTTTAC GGTTCCTGGC CTTTTGCTGG CCTTTTGCTC ACATGTTCTT TCCTGCGTTA 2160

TCCCCTGATT CTGTGGATAA CCGTATTACC GCCTTTGAGT GAGCTGATAC CGCTCGCCGC 2220

AGCCGAACGA CCGAGCGCAG CGAGTCAGTG AGCGAGGAAG CGGAAGAGCG CCTGATGCGG 2280

TATTTTCTCC TTACGCATCT GTGCGGTATT TCACACCGCA TATATGGTGC ACTCTCAGTA 2340

CAATCTGCTC TGATGCCGCA TAGTTAAGCC AGTATACACT CCGCTATCGC TACGTGACTG 2400

GGTCATGGCT GCGCCCCGAC ACCCGCCAAC ACCCGCTGAC GCGCCCTGAC GGGCTTGTCT 2460

GCTCCCGGCA TCCGCTTACA GACAAGCTGT GACCGTCTCC GGGAGCTGCA TGTGTCAGAG 2520

GTTTTCACCG TCATCACCGA AACGCGCGAG GCAGCTGCGG TAAAGCTCAT CAGCGTGGTC 2580

GTGAAGCGAT TCACAGATGT CTGCCTGTTC ATCCGCGTCC AGCTCGTTGA GTTTCTCCAG 2640

AAGCGTTAAT GTCTGGCTTC TGATAAAGCG GGCCATGTTA AGGGCGGTTT TTTCCTGTTT 2700

GGTCACTGAT GCCTCCGTGT AAGGGGGATT TCTGTTCATG GGGGTAATGA TACCGATGAA 2760

ACGAGAGAGG ATGCTCACGA TACGGGTTAC TGATGATGAA CATGCCCGGT TACTGGAACG 2820 

TTGTGAGGGT AAACAACTGG CGGTATGGAT GCGGCGGGAC CAGAGAAAAA TCACTCAGGG 2880 

TCAATGCCAG CGCTTCGTTA ATACAGATGT AGGTGTTCCA CAGGGTAGCC AGCAGCATCC 2940

TGCGATGCAG ATCCGGAACA TAATGGTGCA GGGCGCTGAC TTCCGCGTTT CCAGACTTTA 3000

CGAAACACGG AAACCGAAGA CCATTCATGT TGTTGCTCAG GTCGCAGACG TTTTGCAGCA 3060

GCAGTCGCTT CACGTTCGCT CGCGTATCGG TGATTCATTC TGCTAACCAG TAAGGCAACC 3120

CCGCCAGCCT AGCCGGGTCC TCAACGACAG GAGCACGATC ATGCGCACCC GTGGGGCCGC 3180

CATGCCGGCG ATAATGGCCT GCTTCTCGCC GAAACGTTTG GTGGCGGGAC CAGTGACGAA 3240

GGCTTGAGCG AGGGCGTGCA AGATTCCGAA TACCGCAAGC GACAGGCCGA TCATCGTCGC 3300

GCTCCAGCGA AAGCGGTCCT CGCCGAAAAT GACCCAGAGC GCTGCCGGCA CCTGTCCTAC 3360

GAGTTGCATG ATAAAGAAGA CAGTCATAAG TGCGGCGACG ATAGTCATGC CCCGCGCCCA 3420

CCGGAAGGAG CTGACTGGGT TGAAGGCTCT CAAGGGCATC GGTCGAGATC CCGGTGCCTA 3480

ATGAGTGAGC TAACTTACAT TAATTGCGTT GCGCTCACTG CCCGCTTTCC AGTCGGGAAA 3540

CCTGTCGTGC CAGCTGCATT AATGAATCGG CCAACGCGCG GGGAGAGGCG GTTTGCGTAT 3600

TGGGCGCCAG GGTGGTTTTT CTTTTCACGA GTGAGACGGG CAACAGCTGA TTGCCCTTCA 3660

CCGCCTGGCC CTGAGAGAGT TGCAGCAAGC GGTCCACGCT GGTTTGCCCC AGCAGGCGAA 3720

AATCCTGTTT GATGGTGGTT AACGGCGGGA TATAACATGA GCTGTCTTCG GTATCGTCGT 3780

ATCCCACTAC CGAGATATCC GCACCAACGC GCAGCCCGGA CTCGGTAATG GCGCGCATTG 3840

CGCCCAGCGC CATCTGATCG TTGGCAACCA GCATCGCAGT GGGAACGATG CCCTCATTCA 3900

GCATTTGCAT GGTTTGTTGA AAACCGGACA TGGCACTCCA GTCGCCTTCC CGTTCCGCTA 3960

TCGGCTGAAT TTGATTGCGA GTGAGATATT TATGCCAGCC AGCCAGACGC AGACGCGCCG 4020

AGACAGAACT TAATGGGCCC GCTAACAGCG CGATTTGCTG GTGACCCAAT GCGACCAGAT 4080

GCTCCACGCC CAGTCGCGTA CCGTCTTCAT GGGAGAAAAT AATACTGTTG ATGGGTGTCT 4140

GGTCAGAGAC ATCAAGAAAT AACGCCGGAA CATTAGTGCA GGCAGCTTCC ACAGCAATGG 4200

CATCCTGGTC ATCCAGCGGA TAGTTAATGA TCAGCCCACT GACGCGTTGC GCGAGAAGAT 4260

TGTGCACCGC CGCTTTACAG GCTTCGACGC CGCTTCGTTC TACCATCGAC ACCACCACGC 4320

TGGCACCCAG TTGATCGGCG CGAGATTTAA TCGCCGCGAC AATTTGCGAC GGCGCGTGCA 4380

GGGCCAGACT GGAGGTGGCA ACGCCAATCA GCAACGACTG TTTGCCCGCC AGTTGTTGTG 4440

CCACGCGGTT GGGAATGTAA TTCAGCTCCG CCATCGCCGC TTCCACTTTT TCCCGCGTTT 4500 

TCGCAGAAAC GTGGCTGGCC TGGTTCACCA CGCGGGAAAC GGTCTGATAA GAGACACCGG 4560 

CATACTCTGC GACATCGTAT AACGTTACTG GTTTCACATT CACCACCCTG AATTGACTCT 4620

CTTCCGGGCG CTATCATGCC ATACCGCGAA AGGTTTTGCG CCATTCGATG GTGTCCGGGA 4680

TCTCGACGCT CTCCCTTATG CGACTCCTGC ATTAGGAAGC AGCCCAGTAG TAGGTTGAGG 4740

CCGTTGAGCA CCGCCGCCGC AAGGAATGGT GCATGCAAGG AGATGGCGCC CAACAGTCCC 4800

CCGGCCACGG GGCCTGCCAC CATACCCACG CCGAAACAAG CGCTCATGAG CCCGAAGTGG 4860

CGAGCCCGAT CTTCCCCATC GGTGATGTCG GCGATATAGG CGCCAGCAAC CGCACCTGTG 4920

GCGCCGGTGA TGCCGGCCAC GATGCGTCCG GCGTAGAGGA TCGAGATCTC GATCCCGCGA 4980

AATTAATACG ACTCACTATA GGGGAATTGT GAGCGGATAA CAATTCCCCT CTAGAAATAA 5040

TTTTGTTTAA CTTTAAGAAG GAGATATACA TATGGGCCAT CATCATCATC ATCACGTGAT 5100 

CGACATCATC GGGACCAGCC CCACATCCTG GGAACAGGCG GCGGCGGAGG CGGTCCAGCG 5160 

GGCGCGGGAT AGCGTCGATG ACATCCGCGT CGCTCGGGTC ATTGAGCAGG ACATGGCCGT 5220 

GGACAGCGCC GGCAAGATCA CCTACCGCAT CAAGCTCGAA GTGTCGTTCA AGATGAGGCC 5280

GGCGCAACCG AGGGGCTCGA AACCACCGAG CGGTTCGCCT GAAACGGGCG CCGGCGCCGG 5340 

TACTGTCGCG ACTACCCCCG CGTCGTCGCC GGTGACGTTG GCGGAGACCG GTAGCACGCT 5400

GCTCTACCCG CTGTTCAACC TGTGGGGTCC GGCCTTTCAC GAGAGGTATC GGAACGTCAC 5460

GATCACCGCT CAGGGCACCG GTTCTGGTGC CGGGATCGCG CAGGCCGCCG CCGGGACGGT 5520

CAACATTGGG GCCTCCGACG CCTATCTGTC GGAAGGTGAT ATGGCCGCGC ACAAGGGGCT 5580

GATGAACATC GCGCTAGCCA TCTCCGCTCA GCAGGTCAAC TACAACCTGC CCGGAGTGAG 5640

CGAGCACCTC AAGCTGAACG GAAAAGTCCT GGCGGCCATG TACCAGGGCA CCATCAAAAC 5700

CTGGGACGAC CCGCAGATCG CTGCGCTCAA CCCCGGCGTG AACCTGCCCG GCACCGCGGT 5760 

AGTTCCGCTG CACCGCTCCG ACGGGTCCGG TGACACCTTC TTGTTCACCC AGTACCTGTC 5820 

CAAGCAAGAT CCCGAGGGCT GGGGCAAGTC GCCCGGCTTC GGCACCACCG TCGACTTCCC 5880

GGCGGTGCCG GGTGCGCTGG GTGAGAACGG CAACGGCGGC ATGGTGACCG GTTGCGCCGA 5940 

GACACCGGGC TGCGTGGCCT ATATCGGCAT CAGCTTCCTC GACCAGGCCA GTCAACGGGG 6000

ACTCGGCGAG GCCCAACTAG GCAATAGCTC TGGCAATTTC TTGTTGCCCG ACGCGCAAAG 6060

CATTCAGGCC GCGGCGGCTG GCTTCGCATC GAAAACCCCG GCGAACCAGG CGATTTCGAT 6120 

GATCGACGGG CCCGCCCCGG ACGGCTACCC GATCATCAAC TACGAGTACG CCATCGTCAA 6180 

CAACCGGCAA AAGGACGCCG CCACCGCGCA GACCTTGCAG GCATTTCTGC ACTGGGCGAT 6240 

CACCGACGGC AACAAGGCCT CGTTCCTCGA CCAGGTTCAT TTCCAGCCGC TGCCGCCCGC 6300 

GGTGGTGAAG TTGTCTGACG CGTTGATCGC GACGATTTCC AGCGCTGAGA TGAAGACCGA 6360

TGCCGCTACC CTCGCGCAGG AGGCAGGTAA TTTCGAGCGG ATCTCCGGCG ACCTGAAAAC 6420 

GCAGATCGAC CAGGTGGAGT CGACGGCAGG TTCGTTGCAG GGCCAGTGGC GCGGCGCGGC 6480

GGGGACGGCC GCCCAGGCCG CGGTGGTGCG CTTCCAAGAA GCAGCCAATA AGCAGAAGCA 6540 

GGAACTCGAC GAGATCTCGA CGAATATTCG TCAGGCCGGC GTCCAATACT CGAGGGCCGA 6600

CGAGGAGCAG CAGCAGGCGC TGTCCTCGCA AATGGGCTTT GTGCCCACAA CGGCCGCCTC 6660

GCCGCCGTCG ACCGCTGCAG CGCCACCCGC ACCGGCGACA CCTGTTGCCC CCCCACCACC 6720 

GGCCGCCGCC AACACGCCGA ATGCCCAGCC GGGCGATCCC AACGCAGCAC CTCCGCCGGC 6780

CGACCCGAAC GCACCGCCGC CACCTGTCAT TGCCCCAAAC GCACCCCAAC CTGTCCGGAT 6840 

CGACAACCCG GTTGGAGGAT TCAGCTTCGC GCTGCCTGCT GGCTGGGTGG AGTCTGACGC 6900 

CGCCCACTTC GACTACGGTT CAGCACTCCT CAGCAAAACC ACCGGGGACC CGCCATTTCC 6960

CGGACAGCCG CCGCGGGTGG CCAATGACAC CCGTATCGTG CTCGGCCGGC TAGACCAAAA 7020 

GCTTTACGCC AGCGCCGAAG CCACCGACTC CAAGGCCGCG GCCCGGTTGG GCTCGGACAT 7080

GGGTGAGTTC TATATGCCCT ACCCGGGCAC CCGGATCAAC CAGGAAACCG TCTCGCTTGA 7140 

CGCCAACGGG GTGTCTGGAA GCGCGTCGTA TTACGAAGTC AAGTTCAGCG ATCCGAGTAA 7200

GCCGAACGGC CAGATCTGGA CGGGCGTAAT CGGCTCGCCC GCGGCGAACG CACCGGACGC 7260 

CGGGCCCCCT CAGCGCTGGT TTGTGGTATG GCTCGGGACC GCCAACAACC CGGTGGACAA 7320

GGGCGCGGCC AAGGCGCTGG CCGAATCGAT CCGGCCTTTG GTCGCCCCGC CGCCGGCGCC 7380

GGCACCGGCT CCTGCAGAGC CCGCTCCGGC GCCGGCGCCG GCCGGGGAAG TCGCTCCTAC 7440

CCCGACGACA CCGACACCGC AGCGGACCTT ACCGGCCTGA GAATTCTGCA GATATCCATC 7500 

ACACTGGCGG CCGCTCGAGC ACCACCACCA CCACCACTGA GATCCGGCTG CTAACAAAGC 7560

CCGAAAGGAA GCTGAGTTGG CTGCTGCCAC CGCTGAGCAA TAACTAGCAT AACCCCTTGG 7620

GGCCTCTAAA CGGGTCTTGA GGGGTTTTTT GCTGAAAGGA GGAACTATAT CCGGAT 7676

(2) INFORMATION FOR SEQ ID NO: 209:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 802 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 209: 

Met Gly His His His His His His Val Ile Asp Ile Ile Gly Thr Ser
1 5 10 15

Pro Thr Ser Trp Glu Gln Ala Ala Ala Glu Ala Val Gln Arg Ala Arg
20 25 30

Asp Ser Val Asp Asp Ile Arg Val Ala Arg Val Ile Glu Gln Asp Met
35 40 45

Ala Val Asp Ser Ala Gly Lys Ile Thr Tyr Arg Ile Lys Leu Glu Val
50 55 60

Ser Phe Lys Met Arg Pro Ala Gln Pro Arg Gly Ser Lys Pro Pro Ser
65 70 75 80

Gly Ser Pro Glu Thr Gly Ala Gly Ala Gly Thr Val Ala Thr Thr Pro
85 90 95

Ala Ser Ser Pro Val Thr Leu Ala Glu Thr Gly Ser Thr Leu Leu Tyr
100 105 110

Pro Leu Phe Asn Leu Trp Gly Pro Ala Phe His Glu Arg Tyr Pro Asn
115 120 125

Val Thr Ile Thr Ala Gln Gly Thr Gly Ser Gly Ala Gly Ile Ala Gln
130 135 140

Ala Ala Ala Gly Thr Val Asn Ile Gly Ala Ser Asp Ala Tyr Leu Ser
145 150 155 160

Glu Gly Asp Met Ala Ala His Lys Gly Leu Met Asn Ile Ala Leu Ala
165 170 175

Ile Ser Ala Gln Gln Val Asn Tyr Asn Leu Pro Gly Val Ser Glu His
180 185 190

Leu Lys Leu Asn Gly Lys Val Leu Ala Ala Met Tyr Gln Gly Thr Ile
195 200 205

Lys Thr Trp Asp Asp Pro Gln Ile Ala Ala Leu Asn Pro Gly Val Asn
210 215 220

Leu Pro Gly Thr Ala Val Val Pro Leu His Arg Ser Asp Gly Ser Gly
225 230 235 240






Asp Thr Phe Leu Phe Thr Gln Tyr Leu Ser Lys Gln Asp Pro Glu Gly
245 250 255

Trp Gly Lys Ser Pro Gly Phe Gly Thr Thr Val Asp Phe Pro Ala Val
260 265 270

Pro Gly Ala Leu Gly Glu Asn Gly Asn Gly Gly Met Val Thr Gly Cys
275 280 285

Ala Glu Thr Pro Gly Cys Val Ala Tyr Ile Gly Ile Ser Phe Leu Asp
290 295 300

Gln Ala Ser Gln Arg Gly Leu Gly Glu Ala Gln Leu Gly Asn Ser Ser
305 310 315 320

Gly Asn Phe Leu Leu Pro Asp Ala Gln Ser Ile Gln Ala Ala Ala Ala
325 330 335

Gly Phe Ala Ser Lys Thr Pro Ala Asn Gln Ala Ile Ser Met Ile Asp
340 345 350

Gly Pro Ala Pro Asp Gly Tyr Pro Ile Ile Asn Tyr Glu Tyr Ala Ile
355 360 365

Val Asn Asn Arg Gln Lys Asp Ala Ala Thr Ala Gln Thr Leu Gln Ala
370 375 380

Phe Leu His Trp Ala Ile Thr Asp Gly Asn Lys Ala Ser Phe Leu Asp
385 390 395 400

Gln Val His Phe Gln Pro Leu Pro Pro Ala Val Val Lys Leu Ser Asp
405 410 415

Ala Leu Ile Ala Thr Ile Ser Ser Ala Glu Met Lys Thr Asp Ala Ala
420 425 430

Thr Leu Ala Gln Glu Ala Gly Asn Phe Glu Arg Ile Ser Gly Asp Leu
435 440 445

Lys Thr Gln Ile Asp Gln Val Glu Ser Thr Ala Gly Ser Leu Gln Gly
450 455 460

Gln Trp Arg Gly Ala Ala Gly Thr Ala Ala Gln Ala Ala Val Val Arg
465 470 475 480

Phe Gln Glu Ala Ala Asn Lys Gln Lys Gln Glu Leu Asp Glu Ile Ser
485 490 495

Thr Asn Ile Arg Gln Ala Gly Val Gln Tyr Ser Arg Ala Asp Glu Glu
500 505 510

Gln Gln Gln Ala Leu Ser Ser Gln Met Gly Phe Val Pro Thr Thr Ala
515 520 525

Ala Ser Pro Pro Ser Thr Ala Ala Ala Pro Pro Ala Pro Ala Thr Pro
530 535 540

Val Ala Pro Pro Pro Pro Ala Ala Ala Asn Thr Pro Asn Ala Gln Pro
545 550 555 560

Gly Asp Pro Asn Ala Ala Pro Pro Pro Ala Asp Pro Asn Ala Pro Pro
565 570 575

Pro Pro Val Ile Ala Pro Asn Ala Pro Gln Pro Val Arg Ile Asp Asn
580 585 590

Pro Val Gly Gly Phe Ser Phe Ala Leu Pro Ala Gly Trp Val Glu Ser
595 600 605

Asp Ala Ala His Phe Asp Tyr Gly Ser Ala Leu Leu Ser Lys Thr Thr
610 615 620

Gly Asp Pro Pro Phe Pro Gly Gln Pro Pro Pro Val Ala Asn Asp Thr
625 630 635 640

Arg Ile Val Leu Gly Arg Leu Asp Gln Lys Leu Tyr Ala Ser Ala Glu
645 650 655

Ala Thr Asp Ser Lys Ala Ala Ala Arg Leu Gly Ser Asp Met Gly Glu
660 665 670

Phe Tyr Met Pro Tyr Pro Gly Thr Arg Ile Asn Gln Glu Thr Val Ser
675 680 685

Leu Asp Ala Asn Gly Val Ser Gly Ser Ala Ser Tyr Tyr Glu Val Lys
690 695 700

Phe Ser Asp Pro Ser Lys Pro Asn Gly Gln Ile Trp Thr Gly Val Ile
705 710 715 720

Gly Ser Pro Ala Ala Asn Ala Pro Asp Ala Gly Pro Pro Gln Arg Trp
725 730 735

Phe Val Val Trp Leu Gly Thr Ala Asn Asn Pro Val Asp Lys Gly Ala
740 745 750

Ala Lys Ala Leu Ala Glu Ser Ile Arg Pro Leu Val Ala Pro Pro Pro
755 760 765

Ala Pro Ala Pro Ala Pro Ala Glu Pro Ala Pro Ala Pro Ala Pro Ala
770 775 780

Gly Glu Val Ala Pro Thr Pro Thr Thr Pro Thr Pro Gln Arg Thr Leu
785 790 795 800

Pro Ala



(2) INFORMATION FOR SEQ ID NO: 210:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 454 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 210:



GTGGCGGCGC TGCGGCCGGC CAGCAGAGCG ATGTGCATCC GTTCGCGAAC CTGATCGCGG 60

TCGACGATGA GCGCGCCGAA CGCCGCGACG ACGAAGAACG TCAGGAAGCC GTCCAGCAGC 120

GCGGTCCGCG CGGTGACGAA GCTGACCCCG TCGCAGATCA GCAGCACCCC GGCGATGGCG 180

CCGACCAATG TCGACCGGCT GATCCGCCGC ACGATCCGCA CCACCAGCGC CACCAGGACC 240

ACACCCAGCA GGGCGCCGGT GAACCGCCAG CCGAATCCGT TGTGACCGAA GATGGCCTCC 300

CCGATCGCGA TCAGCTGCTT ACCGACCGGC GGGTGAACCA CCAGGCCGTA CCCGGGGTTG 360

TCTTCCACCC CATGGTTGTT CAGCACCTGC CAGGCCTGGC GGTGCGTAAT GCTTCTCGTC 420

GAAGATGGGG GTGCCGGCAT CCGTCACCGA GCCC 454



(2) INFORMATION FOR SEQ ID NO: 211: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 470 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 211: 

TGCAGAAGTA CGGCGGATCC TCGGTGGCCG ACGCCGAACG GATTCGCCGC GTCGCCGAAC 60

GCATCGTCGC CACCAAGAAG CAAGGCAATG ACGTCGTCGT CGTCGTCTCT GCCATGGGGG 120 

ATACCACCGA CGACCTGCTG GATCTGGCTC AGCAGGTGTG CCCGGCGCCG CCGCCTCGGG 180 

AGCTGGACAT GCTGCTTACC GCCGGTGAAC GCATCTCGAA TGCGTTGGTG GCCATGGCCA 240

TCGAGTCGCT CGGCGCGCAT GCCCGGTCGT TCACCGGTTC GCAGGCCGGG GTGATCACCA 300

CCGGCACCCA CGGCAACGCC AAGATCATCG ACGTCACGCC GGGGCGGCTG CAAACCGCCC 360 

TTGAGGAAGG GCGGGTC3TC TTGGTGGCCG GATTCCAAGG GGTCAGCCAG OACACCAAGG 420 

ATGTCACGAC GTTGGGCCGC GGCGGCTCGG ACACCACCGC CGTCGCCATG 470

(2) INFORMATION FOR SEQ ID NO: 212:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 279 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 212:

GGCCGGCCTA CCCGGCCGGG ACAAACAACG ATCGATTGAT ATCGATGAGA GACGGAGGAA 60

TCGTGGCCCT TCCCCAGTTG ACCGACGAGC AGCGCGCGGC CGCGTTGGAG AAGGCTGCTG 120

CCGCACGTCG AGCGCGAGCA GAGCTCAAGG ATCGGCTCAA CCGTGGCGGC ACCAACCTCA 180

CCCAGGTCCT CAAGGACGCG GAGAGCGATG AAGTCTTGGG CAAAATGAAG GTGTCTGCGC 240

TGCTTGAGGC CTTGCCAAAG GTGGGCAAGG TCCAGGCGC 279

(2) INFORMATION FOR SEQ ID NO: 213:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 219 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 213: 

ACACGGTCGA ACTCGACGAG CCCCTCGTGG AGGTGTCGAC CGACAAGGTC GACACCGAAA 60 

TCCCTCGCCG GCCGCGGGTG TGCTGACCAA GATCATCGCC CAAGAAGATG ACACGGTCGA 120

GGTCGGCGGC GAGCTCTCTG TCATTGGCGA CGCCCATGAT GCCGGCGAGG CCGCGGTCCC 180 

GGCACCCCAG AAAGTCTCTG CCGGCCCAAC CCGAATCCA 219 

(2) INFORMATION FOR SEQ ID NO: 214:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 342 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:214: 

TCGCTGCCGA CATCGGCGCC GCGCCCGCCC CCAAGCCCGC ACCCAAGCCC GTCCCCGAGC 60 

CAGCGCCGAC GCCGAAGGCC GAACCCGCAC CATCGCCGCC GGCGGCCCAG CCAGCCGGTG 120

CGGCCGAGGG CGCACCGTAC GTGACGCCGC TGGTGCGAAA GCTGGCGTCG GAAAACAACA 180 

TCGACCTCGC CGGGGTGACC GGCACCGGAG TGGGTGGTCG CATCCGCAAA CAGGATGTGC 240

TGGCCGCGGC TGAACAAAAG AAGCGGGCGA AAGCACCGGC GCCGGCCGCC CAGGCCGCCG 300

CCGCGCCGGC CCCGAAAGCG CCGCCTGAAG ATCCGATGCC GC 342 

(2) INFORMATION FOR SEQ ID NO: 215: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 515 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:215:

GGGTCTTGGT CAGTATCAGC GCCGACGAGG ACGCCACGGT GCCCGTCGGC GGCGAGTTGG 60 

CCCGGATCGG TGTCGCTGCC GACATCGGCG CCGCGCCCGC CCCCAAGCCC GCACCCAAGC 120

CCGTCCCCGA GCCAGCGCCG ACGCCGAAGG CCGAACCCGC ACCATCGCCG CCGGCGGCCC 180

AGCCAGCCGG TGCGGCCGAG GGCGCACCGT ACGTGACGCC GCTGGTGCGA AAGCTGGCGT 240

CGGAAAACAA CATCGACCTC GCCGGGGTGA CCGGCACCGG AGTGGGTGGT CGCATCCGCA 300

AACAGGATGT GCTGGCCGCG GCTGAACAAA AGAAGCGGGC GAAAGCACCG GCGCCCTGAG 360

CGCTTCATCA CCCGGTTAAC CAGCTTGCCC CAGAAGCCGG CTTCGACCTC TTCGCGGGTC 420

TTGGTCCGCT GCAGGCGGTC GGCGAGCCAG TTCAGGTTAG GCGGCCGAAA TCTTCCAGTT 480

CGCCAGGAAG GGCACCCGGA ACAGGGTCCG CACCC 515 

(2) INFORMATION FOR SEQ ID NO: 216:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 557 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 216: 

CCGACCCCAA GGTGCAGATT CAACAGGCCA TTGAGGAAGC ACAGCGCACC CACCAAGCGC 60

TGACTCAACA GGCGGCGCAA GTGATCGGTA ACCAGCGTCA ATTGGAGATG CGACTCAACC 120 

GACAGCTGGC GGACATCGAA AAGCTTCAGG TCAATGTGCG CCAAGCCCTG ACGCTGGCCG 180 

ACCAGGCCAC CGCCGCCGGA GACGCTGCCA AGGCCACCGA ATACAACAAC GCCGCCGAGG 240

CGTTCGCAGC CCAGCTGGTG ACCGCCGAGC AGAGCGTCGA AGACCTCAAG ACGCTGCATG 300

ACCAGGCGCT TAGCGCCGCA GCTCAGGCCA AGAAGGCCGT CGAACGAAAT GCGATGGTGC 360

TGCAGCAGAA GATCGCCGAG CGAACCAAGC TGCTCAGCCA GCTCGAGCAG GCGAAGATGC 420 

AGGAGCAGGT CAGCGCATCG TTGCGGTCGA TGAGTGAGCT CGCCGCGCCA GGCAACACGC 480 

CGAGCCTCGA CGAGGTGCGC GACAAGATCG AGCGTCGCTA CGCCAACGCG ATCGGTTCGG 540 

CTGAACTTGC CGAGAGT 557 

(2) INFORMATION FOR SEQ ID NO: 217: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 223 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 217:



CAGGATAGGT TTCGACATCC ACCTGGGTTC CGCACCCGGT GCGCGACCGT GTGATAGGCC 60

AGAGGTGGAC CTGCGCCGAC CGACGATCGA TCGAGGAGTC AACAGAAATG GCCTTCTCCG 120

TCCAGATGCC GGCACTCGGT GAGAGCGTCA CCGAGGGGAC GGTTACCCGC TGGCTCAAAC 180 

AGGAAGGCGA CACGGTCGAA CTCGACGAGC CCCTCGTGGA GGT 223 



(2) INFORMATION FOR SEQ ID NO: 218:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 578 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 218:



AAGAAGTACA TCTGCCGGTC GATGTCGGCG AACCACGGCA GCCAACCGGC GCAGTAGCCG 60 

ACCAGGACCA CCGCATAACG CCAGTCCCGG CGCACAAACA TACGCCACCC CGCGTATGCC 120 

AGGACTGGCA CCGCCAGCCA CCACATCGCG GGCGTGCCGA CCAGCATCTC GGCCTTGACG 180 

CACGACTGTG CGCCGCAGCC TGCAACGTCT TGCTGGTCGA TGGCGTACAG CACCGGCCGC 240 

AACGACATGG GCCAGGTCCA CGGTTTGGAT TCCCAAGGGT GGTAGTTGCC TGCGGAATTC 300 

GTCAGGCCCG CGTGGAAGTG GAACGCTTTG GCGGTGTATT GCCAGAGCGA GCGCACGGCG 360 

TCGGGCAGCG GAACAACCGA GTTGCGACCG ACCGCTTGAC CGACCGCATG CCGATCGATC 420 

GCGGTCTCGG ACGCGAACCA CGGAGCGTAG GTGGCCAGAT AGACCGCGAA CGGGATCAAC 480 

CCCAGCGCAT ACCCGCTGGG AAGCACGTCA CGCCGCACTG TTCCCAGCCA CGGTCTTTGC 540 

ACTTGGTATG AACGTCGCGC CGCCACGTCA ACGCCAGC 578 



(2) INFORMATION FOR SEQ ID NO: 219: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 484 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: Genomic DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 219: 



ACAACGATCG ATTGATATCG ATGAGAGACG GAGGAATCGT GGCCCTTCCC CAGTTGACCG 60

ACGAGCAGCG CGCGGCCGCG TTGGAGAAGG CTGCTGCCGC ACGTCGAGCG CGAGCAGAGC 120

TCAAGGATCG GCTCAAGCGT GGCGGCACCA ACCTCACCCA GGTCCTCAAG GACGCGGAGA 180

GCGATGAAGT CTTGGGCAAA ATGAAGGTGT CTGCGCTGCT TGAGGCCTTG CCAAAGGTGG 240

GCAAGGTCAA GGCGCAGGAG ATCATGACCG AGCTGGAAAT TGCGCCCCAC CCCGCCGCCT 300

rCGTGGCCTC GGTGACCGTC AGCGCAAGGC CCTGCTGGAA AAGTTCGGCT CCGCCTAACC 360

CCGCCGGCCG ACGATGCGGG CCGGAAGGCC TGTGGTGGGC GTACCCCCGC ATACGGGGGA 420

GAAGCGGCCT GACAGGGCCA GCTCACAATT CAGGCCGAAC GCCCCGGTGG GGGGGAACCC 480

GCCC 484



(2) INFORMATION FOR SEQ ID NO: 220: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 537 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: Genomic DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 220: 



AGGACTGGCA CCGCCAGCCA CCACATCGCG GGCGTGCCGA CCAGCATCTC GGCCTTGACG 60 

CACGACTGTG CGCCGCAGCC TGCAACGTCT TGCTGGTCGA TGGCGTACAG CACCGGCCGC 120 

AACGACATGG GCCAGGTCCA CGGTTTGGAT TCCCAAGGGT GGTAGTTGCC TGCGGAATTC 180 

GTCAGGCCCG CGTGGAAGTG GAACGCTTTG GCGGTGTAGT GCCAGAGCGA GCGCACGGCG 240 

TCGGGCAGCG GAACAACCGA GTTGCGACCG ACCGCTTGAC CGACCGCATG CCGATCGATC 300 

GCGGTCTCGG ACGCGAACCA CGGAGCGTAG GTGGCCAGAT AGACCGCGAA CGGGATCAAC 360 

CCCAGCGCAT ACCCGCTGGG AAGCACGTCA CGCCGCACTG TCCCCAGCCA CGGTCTTTGC 420

ACTTGGTACT GACGTCGCGC CGCCACGTCG AACGCCAGCG CCATCGCGCC GAAGAACAGC 480

ACGAAGTACA CGCCGGACCA CTTGGTGGCG CAAGCCAATC CCAAGCAGCA CCCCGGC 537

(2) INFORMATION FOR SEQ ID NO:221: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 135 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 221:

Gly Gly Ala Ala Ala Gly Gln Gln Ser Asp Val His Pro Phe Ala Asn
1 5 10 15

Leu Ile Ala Val Asp Asp Glu Arg Ala Glu Arg Arg Asp Asp Glu Glu
20 25 30

Arg Gln Glu Ala Val Gln Gln Arg Gly Pro Arg Gly Asp Glu Ala Asp
35 40 45

Pro Val Ala Asp Gln Gln His Pro Gly Asp Gly Ala Asp Gln Cys Arg
50 55 60

Pro Ala Asp Pro Pro His Asp Pro His His Gln Arg His Gln Asp His
65 70 75 80

Thr Gln Gln Gly Ala Gly Glu Pro Pro Ala Glu Ser Val Val Thr Glu
85 90 95

Asp Gly Leu Pro Asp Arg Asp Gln Leu Leu Thr Asp Arg Arg Val Asn
100 105 110

His Gln Ala Val Pro Gly Val Val Phe His Pro Met Val Val Gln His
115 120 125

Leu Pro Gly Leu Ala Val Arg
130 135

















(2) INFORMATION FOR SEQ ID NO: 222:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 156 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 222:

Gln Lys Tyr Gly Gly Ser Ser Val Ala Asp Ala Glu Arg Ile Arg Arg
1 5 10 15

Val Ala Glu Arg Ile Val Ala Thr Lys Lys Gln Gly Asn Asp Val Val
20 25 30

Val Val Val Ser Ala Met Gly Asp Thr Thr Asp Asp Leu Leu Asp Leu
35 40 45

Ala Gln Gln Val Cys Pro Ala Pro Pro Pro Arg Glu Leu Asp Met Leu
50 55 60

Leu Thr Ala Gly Glu Arg Ile Ser Asn Ala Leu Val Ala Met Ala Ile
65 70 75 80

Glu Ser Leu Gly Ala His Ala Arg Ser Phe Thr Gly Ser Gln Ala Gly
85 90 95

Val Ile Thr Thr Gly Thr His Gly Asn Ala Lys Ile Ile Asp Val Thr
100 105 110

Pro Gly Arg Leu Gln Thr Ala Leu Glu Glu Gly Arg Val Val Leu Val
115 120 125

Ala Gly Phe Gln Gly Val Ser Gln Asp Thr Lys Asp Val Thr Thr Leu

130 135 140 

Gly Arg Gly Gly Ser Asp Thr Thr Ala Val Ala Met 
145 150 155 

(2) INFORMATION FOR SEQ ID NO: 223: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 92 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 223:

Pro Ala Tyr Pro Ala Gly Thr Asn Asn Asp Arg Leu Ile Ser Met Arg
1 5 10 15

Asp Gly Gly Ile Val Ala Leu Pro Gln Leu Thr Asp Glu Gln Arg Ala
20 25 30

Ala Ala Leu Glu Lys Ala Ala Ala Ala Arg Arg Ala Arg Ala Glu Leu
35 40 45

Lys Asp Arg Leu Lys Arg Gly Gly Thr Asn Leu Thr Gln Val Leu Lys
50 55 60

Asp Ala Glu Ser Asp Glu Val Leu Gly Lys Met Lys Val Ser Ala Leu
65 70 75 80

Leu Glu Ala Leu Pro Lys Val Gly Lys Val Gln Ala
85 90

(2) INFORMATION FOR SEQ ID NO: 224:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 72 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 224: 

Thr Val Glu Leu Asp Glu Pro Leu Val Glu Val Ser Thr Asp Lys Val
1 5 10 15

Asp Thr Glu Ile Pro Ser Pro Ala Ala Gly Val Leu Thr Lys Ile Ile
20 25 30

Ala Gln Glu Asp Asp Thr Val Glu Val Gly Gly Glu Leu Ser Val Ile
35 40 45

Gly Asp Ala His Asp Ala Gly Glu Ala Ala Val Pro Ala Pro Gln Lys
50 55 60

Val Ser Ala Gly Pro Thr Arg Ile
65 70

(2) INFORMATION FOR SEQ ID NO: 225: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 113 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 225: 

Ala Ala Asp Ile Gly Ala Ala Pro Ala Pro Lys Pro Ala Pro Lys Pro
1 5 10 15

Val Pro Glu Pro Ala Pro Thr Pro Lys Ala Glu Pro Ala Pro Ser Pro
20 25 30

Pro Ala Ala Gln Pro Ala Gly Ala Ala Glu Gly Ala Pro Tyr Val Thr
35 40 45

Pro Leu Val Arg Lys Leu Ala Ser Glu Asn Asn Ile Asp Leu Ala Gly
50 55 60

Val Thr Gly Thr Gly Val Gly Gly Arg Ile Arg Lys Gln Asp Val Leu
65 70 75 80

Ala Ala Ala Glu Gln Lys Lys Arg Ala Lys Ala Pro Ala Pro Ala Ala
85 90 95

Gln Ala Ala Ala Ala Pro Ala Pro Lys Ala Pro Pro Glu Asp Pro Met
100 105 110

Pro 



(2) INFORMATION FOR SEQ ID NO: 226:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 118 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 226: 

Val Leu Val Ser Ile Ser Ala Asp Glu Asp Ala Thr Val Pro Val Gly
1 5 10 15

Gly Glu Leu Ala Arg Ile Gly Val Ala Ala Asp Ile Gly Ala Ala Pro
20 25 30

Ala Pro Lys Pro Ala Pro Lys Pro Val Pro Glu Pro Ala Pro Thr Pro
35 40 45

Lys Ala Glu Pro Ala Pro Ser Pro Pro Ala Ala Gln Pro Ala Gly Ala
50 55 60

Ala Glu Gly Ala Pro Tyr Val Thr Pro Leu Val Arg Lys Leu Ala Ser
65 70 75 80

Glu Asn Asn Ile Asp Leu Ala Gly Val Thr Gly Thr Gly Val Gly Gly
85 90 95

Arg Ile Arg Lys Gln Asp Val Leu Ala Ala Ala Glu Gln Lys Lys Arg
100 105 110

Ala Lys Ala Pro Ala Pro
115



(2) INFORMATION FOR SEQ ID NO: 227: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 185 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 227:



Asp Pro Lys Val Gln Ile Gln Gln Ala Ile Glu Glu Ala Gln Arg Thr
1 5 10 15

His Gln Ala Leu Thr Gln Gln Ala Ala Gln Val Ile Gly Asn Gln Arg
20 25 30

Gln Leu Glu Met Arg Leu Asn Arg Gln Leu Ala Asp Ile Glu Lys Leu
35 40 45

Gln Val Asn Val Arg Gln Ala Leu Thr Leu Ala Asp Gln Ala Thr Ala
50 55 60

Ala Gly Asp Ala Ala Lys Ala Thr Glu Tyr Asn Asn Ala Ala Glu Ala
65 70 75 80

Phe Ala Ala Gln Leu Val Thr Ala Glu Gln Ser Val Glu Asp Leu Lys
85 90 95

Thr Leu His Asp Gln Ala Leu Ser Ala Ala Ala Gln Ala Lys Lys Ala
100 105 110

Val Glu Arg Asn Ala Met Val Leu Gln Gln Lys Ile Ala Glu Arg Thr
115 120 125

Lys Leu Leu Ser Gln Leu Glu Gln Ala Lys Met Gln Glu Gln Val Ser
130 135 140

Ala Ser Leu Arg Ser Met Ser Glu Leu Ala Ala Pro Gly Asn Thr Pro
145 150 155 160

Ser Leu Asp Glu Val Arg Asp Lys Ile Glu Arg Arg Tyr Ala Asn Ala
165 170 175

Ile Gly Ser Ala Glu Leu Ala Glu Ser
180 185



(2) INFORMATION FOR SEQ ID NO: 228:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 71 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 228:

Val Ser Thr Ser Thr Trp Val Pro His Pro Val Arg Asp Arg Val Ile
1 5 10 15

Gly Gln Arg Trp Thr Cys Ala Asp Arg Arg Ser Ile Glu Glu Ser Thr
20 25 30

Glu Met Ala Phe Ser Val Gln Met Pro Ala Leu Gly Glu Ser Val Thr
35 40 45

Glu Gly Thr Val Thr Arg Trp Leu Lys Gln Glu Gly Asp Thr Val Glu

50 55 60 

Leu Asp Glu Pro Leu Val Glu 
65 70 

(2) INFORMATION FOR SEQ ID NO: 229:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 182 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 229: 

Glu Val His Leu Pro Val Asp Val Gly Glu Pro Arg Gln Pro Thr Gly
1 5 10 15

Ala Val Ala Asp Gln Asp His Arg Ile Thr Pro Val Pro Ala His Lys
20 25 30

His Thr Pro Pro Arg Val Cys Gln Asp Trp His Arg Gln Pro Pro His
35 40 45

Arg Gly Arg Ala Asp Gln His Leu Gly Leu Asp Ala Arg Leu Cys Ala
50 55 60

Ala Ala Cys Asn Val Leu Leu Val Asp Gly Val Gln His Arg Pro Gln
65 70 75 80

Arg His Gly Pro Gly Pro Arg Phe Gly Phe Pro Arg Val Val Val Ala
85 90 95

Cys Gly Ile Arg Gln Ala Arg Val Glu Val Glu Arg Phe Gly Gly Val
100 105 110

Leu Pro Glu Arg Ala His Gly Val Gly Gln Arg Asn Asn Arg Val Ala
115 120 125

Thr Asp Arg Leu Thr Asp Arg Met Pro Ile Asp Arg Gly Leu Gly Arg
130 135 140

Glu Pro Arg Ser Val Gly Gly Gln Ile Asp Arg Glu Arg Asp Gln Pro
145 150 155 160

Gln Arg Ile Pro Ala Gly Lys His Val Thr Pro His Cys Ser Gln Pro

165 170 175 

Arg Ser Leu His Leu Val 
180 

(2) INFORMATION FOR SEQ ID NO: 230: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 160 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 230:

Asn Asp Arg Leu Ile Ser Met Arg Asp Gly Gly Ile Val Ala Leu Pro
1 5 10 15

Gln Leu Thr Asp Glu Gln Arg Ala Ala Ala Leu Glu Lys Ala Ala Ala
20 25 30

Ala Arg Arg Ala Arg Ala Glu Leu Lys Asp Arg Leu Lys Arg Gly Gly
35 40 45

Thr Asn Leu Thr Gln Val Leu Lys Asp Ala Glu Ser Asp Glu Val Leu
50 55 60

Gly Lys Met Lys Val Ser Ala Leu Leu Glu Ala Leu Pro Lys Val Gly
65 70 75 80

Lys Val Lys Ala Gln Glu Ile Met Thr Glu Leu Glu Ile Ala Pro His
85 90 95

Pro Ala Ala Phe Val Ala Ser Val Thr Val Ser Ala Arg Pro Cys Trp
100 105 110

Lys Ser Ser Ala Pro Pro Asn Pro Ala Gly Arg Arg Cys Gly Pro Glu
115 120 125

Gly Leu Trp Trp Ala Tyr Pro Arg Ile Arg Gly Arg Ser Gly Leu Thr
130 135 140

Gly Pro Ala His Asn Ser Gly Arg Thr Pro Arg Trp Gly Gly Thr Arg
145 150 155 160



(2) INFORMATION FOR SEQ ID NO: 231: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 178 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 231:

Asp Trp His Arg Gln Pro Pro His Arg Gly Arg Ala Asp Gln His Leu
1 5 10 15

Gly Leu Asp Ala Arg Leu Cys Ala Ala Ala Cys Asn Val Leu Leu Val
20 25 30

Asp Gly Val Gln His Arg Pro Gln Arg His Gly Pro Gly Pro Arg Phe
35 40 45

Gly Phe Pro Arg Val Val Val Ala Cys Gly Ile Arg Gln Ala Arg Val
50 55 60

Glu Val Glu Arg Phe Gly Gly Val Val Pro Glu Arg Ala His Gly Val
65 70 75 80

Gly Gln Arg Asn Asn Arg Val Ala Thr Asp Arg Leu Thr Asp Arg Met
85 90 95

Pro Ile Asp Arg Gly Leu Gly Arg Glu Pro Arg Ser Val Gly Gly Gln
100 105 110

Ile Asp Arg Glu Arg Asp Gln Pro Gln Arg Ile Pro Ala Gly Lys His
115 120 125

Val Thr Pro His Cys Pro Gln Pro Arg Ser Leu His Leu Val Leu Thr
130 135 140

Ser Arg Arg His Val Glu Arg Gln Arg His Arg Ala Glu Glu Gln His
145 150 155 160

Glu Val His Ala Gly Pro Leu Gly Gly Ala Ser Gln Ser Gln Ala Ala
165 170 175

Pro Arg



(2) INFORMATION FOR SEQ ID NO: 232: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 271 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:232: 

ATGCCAAGCC GGTGCTGATG CCCGAGCTCG GCGAATCGGT GACCGAGGGG ACCGTCATTC 60 

GTTGGCTGAA GAAGATCGGG GATTCGGTTC AGGTTGACGA GCCACTCGTG GAGGTGTCCA 120

CCGACAAGGT GGACACCGAG ATCCCGTCCC CGGTGGCTGG GGTCTTGGTC AGTATCAGCG 180

CCGACGAGGA CGCCACGGTG CCCGTCGGCG GCGAGTTGGC CCGGATCGGT GTCGCTGCCG 240

AGATCGGCGC CGCGCCCGCC CCCAAGCCCC C 271 

(2) INFORMATION FOR SEQ ID NO: 233:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 89 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 233:

Ala Lys Pro Val Leu Met Pro Glu Leu Gly Glu Ser Val Thr Glu Gly
1 5 10 15

Thr Val Ile Arg Trp Leu Lys Lys Ile Gly Asp Ser Val Gln Val Asp
20 25 30

Glu Pro Leu Val Glu Val Ser Thr Asp Lys Val Asp Thr Glu Ile Pro
35 40 45

Ser Pro Val Ala Gly Val Leu Val Ser Ile Ser Ala Asp Glu Asp Ala
50 55 60

Thr Val Pro Val Gly Gly Glu Leu Ala Arg Ile Gly Val Ala Ala Glu
65 70 75 80

Ile Gly Ala Ala Pro Ala Pro Lys Pro
85

(2) INFORMATION FOR SEQ ID NO: 234:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 107 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 234:

GAGGTAGCGG ATGGCCGGAG GAGCACCCCA GGACCGCGCC CGAACCGCGG GTGCCGGTCA 60

TCGATATGTG GGCACCGTTC GTTCCGTCCG CCGAGGTCAT TGACGAT 107

(2) INFORMATION FOR SEQ ID NO: 235: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 339 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 235:

ATGAAGTTGA AGTTTGCTCG CCTGAGTACT GCGATACTGG GTTGTGCAGC GGCGCTTGTG 60

TTTCCTGCCT CGGTTGCCAG CGCAGATCCA CCTGACCCGC ATCAGCCGGA CATGACGAAA 120

GGCTATTGCC CGGGTGGCCG ATGGGGTTTT GGCGACTTGG CCGTGTGCGA CGGCGAGAAG 180

TACCCCGACG GCTCGTTTTG GCACCAGTGG ATGCAAACGT GGTTTACCGG CCCACAGTTT 240

TACTTCGATT GTGTCAGCGG CGGTGAGCCC CTCCCCGGCC CGCCGCCACC GGGTGGTTGC 300

GGTGGGGCAA TTCCGTCCGA GCAGCCCAAC GCTCCCTGA 339

(2) INFORMATION FOR SEQ ID NO: 236:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 112 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 236:

Met Lys Leu Lys Phe Ala Arg Leu Ser Thr Ala Ile Leu Gly Cys Ala
1 5 10 15

Ala Ala Leu Val Phe Pro Ala Ser Val Ala Ser Ala Asp Pro Pro Asp
20 25 30

Pro His Gln Pro Asp Met Thr Lys Gly Tyr Cys Pro Gly Gly Arg Trp
35 40 45

Gly Phe Gly Asp Leu Ala Val Cys Asp Gly Glu Lys Tyr Pro Asp Gly
50 55 60

Ser Phe Trp His Gln Trp Met Gln Thr Trp Phe Thr Gly Pro Gln Phe
65 70 75 80

Tyr Phe Asp Cys Val Ser Gly Gly Glu Pro Leu Pro Gly Pro Pro Pro
85 90 95

Pro Gly Gly Cys Gly Gly Ala Ile Pro Ser Glu Gln Pro Asn Ala Pro
100 105 110

(2) INFORMATION FOR SEQ ID NO: 237:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 371 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 237:

GTGACCACGG TGGGCCTGCC ACCAACCCGG GCAGCGGCAG CCGCGGCGGC GCCGGCGGCT 60

CCGGCGGCAA CGGTGGCGCC GGGGGTAACG CCACCGGCTC AGGCGGCAAG GGCGGCGCCG 120

GTGGCAATGG CGGTGATGGG AGCTTCGGCG CTACCAGCGG CCCCGCCTCC ATCGGGGTCA 180

CGGGCGCCCC CGGCGGCAAC GGCGGCAAGG GCGGCGCCGG TGGCAGCAAC CCCAACGGCT 240

CAGGTGGCGA CGGCGGCAAA GGCGGCAACG GCGGTGCCGG CGGCAACGGG GGCTCGATCG 300

GCGCCAACAG CGGCATCGTC GGCGGTTCCG GTGGGGCCGG TGGCGCTGGC GGCGCCGGCG 360

GAAACGGCAG C 371

(2) INFORMATION FOR SEQ ID NO: 238:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 424 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 238:

^^r^"^^^ CACCACCGCG CCGGCGCGCC CCTAGCGGCC GGGCGCACCA GCCCCTTTTC 60

CAAGAAAAGG GCCTTCTGTT TGGTCGGCCA TGTTGGCATG ATCGTGACCC 120

ATGGGCAACA TCGACGTCGA CATCTCGGCC .^GGTCTAGC TCCATGCGAA TCGCCGCCGC 180

GGTGGTGAGC ATCGGTCTAG CCGTCATAGC AGGGTTCGCG GTACCTGTTG CCGACGCACA 240

Cww^TCGGAG CCCGGGGTTG TGTCCTACGC GGTGCTCGGA AAGGGGTCGG TCGGCAACAT 300

C..CGGCGCC CCAATGGGGT GGGAGGCGGT GTTCACCAAG CCGTTCCAGG CGTTITGGGT 360

^GAACTACCG GCGTGCAACA ACTGGGTGGA CATCGGGCTG CCCGAGGTGT ACGACGATCC 420

^uAC 424

(2) INFORMATION FOR SEQ ID NO: 239:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 317 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 239:

GCGATGGCGG CCGCGGGTAC CACCGCCAAT GTGGAACGGT ITCCCAACCC CAACGATCCT 60

TTGCATCTGG CGTCAATTGA CTTCAGCCCG GCCGATTTCG TCACCGAGGG CCACCGTC-A 120

i^S^^^' CGATCCTACT GCGCCGTACC GACCGGCTGC CTTTCGCCGA gSSSt 180

TGGGACTTGG TGGAGTCGCA GTTGCGCACG ACCGTCACCG CCGACACGGT GCGCATCGAC 240

CgS^SaCO JSSxr^ ^'"'"^'"^ OCGGCGGCGT CCAAACTCAC CGAATCCCTG 300

317



(2) INFORMATION FOR SEQ ID NO: 240:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 422 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 240:

TGGCGTATGC GCTTCGCAGC CGGTGCCGCG TCAACGCGCC GGAGGCAATC GCTTCGCTGC 60

CGAGGAATGG TTCGATCACG ATCGCAGTGT GCCGTCGTGC ACCGACACCG cScSS 120

J^^^?^^ OCGGAAAATC GGCCGAAATC TCGCCCTCAG TTCACGCTCG GCGcS^CG 180

gSSS^P I;™'"' GCTTCTCGGC GAACGCGCGC GGGCCTTCCT TGGCGTCGTC 240

GGACAGGAAG ACCTTGATGC CGATCTGGGT GTCGATCTTG AACGCCTCGT TITCGGGCAT 300

^ej^!^^''" TCGCGGATGG ACCGCAAGAT GGCCTGCACG GCCAGGGGTC CGTTAGCCGA 360

GATGGCG.CG GCAAGTTCTA GAACCTTGGT CAACGCCTGG CCGTCGGGCA CACGTGGCCG 420

422

(2) INFORMATION FOR SEQ ID NO: 241: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 426 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: Single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 241:

?ScGTCG?^ S^SSS^ CCGCGGCTGC CAGATCTCCC GGACTCGGTA GTGCCGCCGG 60

S^SSS r^^^I CGGGGCGCGG CGACCATAAG GTCGCTAATG CCCAGGTAGC 120

S^CGgS^^ ^^rrTr ^^^^^^^^^ GACTCTCCAG CTCGCCGACC GGGAGCTTGG 180

SgTGGC^S GcSS^Jn ^^^^^^^^^ ACAAGTCGAT CGAATGCATA GTGGCCTCCA 240

AG?^?^r^r r^^J^^n ^^^^^"^^ CGGCAAATGC CTTGATTTCT AGCTCCGCGT 300

AC^tSS^^ ScrrSS^ GGGATGAATG GGAACCGCAG GATGGCGACA AACGGGTCTG 360

AC.TCAGGTT TGCCGCTTTG CGCACAGTGG TCGACAGCCG GTACTCGGCA TAAATGCTGG 420

426

(2) INFORMATION FOR SEQ ID NO: 242:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 327 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 242:

AGACCGGCGA GGGTGTGGTC GCTGCCCGCG GCAriGTCGA TAATCTGCGC TGGCTCGACG 60 

CGCCGATCAA CTAGTGAGGC GCAACGCTAG GCTTTGGGAT ACCCACAGCT AAAAAGTriA 120 

TCAAAGAAAC GAAGAAGGTT GCCATGAGCA CTGTTGCCGC CTACGCCGCC ATGTCGGCGA 180 

CCGAACCCCT GACCAAGACC ACGATCACCC GTCGCGACCC GGGCCCGCAC GACATGGCGA 240 

TCGACATCAA ATTCGCCGGA ATCTGTCGCT CGGACATCCA TACCGTCCAA ACCGAATGGG 300 

GGCAACCGAA TTTACCTGTG GTCCCTG 327 

(2) INFORMATION FOR SEQ ID NO: 243: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 123 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:243: 

Asp His Gly Gly Pro Ala Thr Asn Pro Gly Ser Gly Ser Arg Gly Gly
1 5 10 15

Ala Gly Gly Ser Gly Gly Asn Gly Gly Ala Gly Gly Asn Ala Thr Gly
20 25 30

Ser Gly Gly Lys Gly Gly Ala Gly Gly Asn Gly Gly Asp Gly Ser Phe
35 40 45

Gly Ala Thr Ser Gly Pro Ala Ser Ile Gly Val Thr Gly Ala Pro Gly
50 55 60

Gly Asn Gly Gly Lys Gly Gly Ala Gly Gly Ser Asn Pro Asn Gly Ser
65 70 75 80

Gly Gly Asp Gly Gly Lys Gly Gly Asn Gly Gly Ala Gly Gly Asn Gly
85 90 95

Gly Ser Ile Gly Ala Asn Ser Gly Ile Val Gly Gly Ser Gly Gly Ala
100 105 110

Gly Gly Ala Gly Gly Ala Gly Gly Asn Gly Ser
115 120

(2) INFORMATION FOR SEQ ID NO: 244: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 104 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 244:

Met Ala Ala Ala Gly Thr Thr Ala Asn Val Glu Arg Phe Pro Asn Pro
1 5 10 15

Asn Asp Pro Leu His Leu Ala Ser Ile Asp Phe Ser Pro Ala Asp Phe
20 25 30

Val Thr Glu Gly His Arg Leu Arg Ala Asp Ala Ile Leu Leu Arg Arg
35 40 45

Thr Asp Arg Leu Pro Phe Ala Glu Pro Pro Asp Trp Asp Leu Val Glu
50 55 60

Ser Gln Leu Arg Thr Thr Val Thr Ala Asp Thr Val Arg Ile Asp Val
65 70 75 80

Ile Ala Asp Asp Met Arg Pro Glu Leu Ala Ala Ala Ser Lys Leu Thr
85 90 95

Glu Ser Leu Arg Leu Tyr Asp Ser
100

(2) INFORMATION FOR SEQ ID NO: 245:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 41 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 245:

Ala Tyr Ala Leu Arg Ser Arg Cys Arg Val Asn Ala Pro Glu Ala Ile
1 5 10 15

Ala Ser Leu Pro Arg Asn Gly Ser Ile Thr Ile Ala Val Cys Arg Arg
20 25 30

Ala Pro Thr Pro Pro Ser Asn Val Asn
35 40

(2) INFORMATION FOR SEQ ID NO: 246:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 246:

Val Pro Leu Asn Thr Ser Pro Arg Leu Pro Asp Leu Pro Asp Ser Val
1 5 10 15

Val Pro Pro Val Ala Ser Leu Leu Ser
20 25

(2) INFORMATION FOR SEQ ID NO: 247:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 61 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:247: 

Met Ser Thr Val Ala Ala Tyr Ala Ala Met Ser Ala Thr Glu Pro Leu
1 5 10 15

Thr Lys Thr Thr Ile Thr Arg Arg Asp Pro Gly Pro His Asp Met Ala
20 25 30

Ile Asp Ile Lys Phe Ala Gly Ile Cys Arg Ser Asp Ile His Thr Val
35 40 45

Gln Thr Glu Trp Gly Gln Pro Asn Leu Pro Val Val Pro
50 55 60 

(2) INFORMATION FOR SEQ ID NO: 248: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 213 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 248:

GCTTGGAGCC CTGGAGCGAC GGTGTGGGTC TGGGGGTCGA TTCGTTCTCG GCGAAAGTCA 60

ACTAAAGACC ACGTTGACAC CCAACCGGCG GCCCGGCATG GGCCGTCGCG GCGTAGAAGC 120

TTTGACCGCG GCGCGAAACG TTCGCTGCTG CGGCCCATGC AGATCGCACA CGCTTGCTTG 180 

AACATCGGGT GGAGCCGGTG GTAACGCCAG GCT 213 

(2) INFORMATION FOR SEQ ID NO: 249: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 367 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:249: 

CCGAGCTGCT GTTCGGCGCC GGCGGTGCGG GCGGCGCGGG TGGGGCGGGC ACCGACGGCG 60

GGCCCGGTGC TACCGGCGGG ACCGGCGGAC ACGGCGGAGT CGGCGGCGAC GGCGGATGGC 120

TGGCACCCGG CGGGGCCGGC GGGGCCGGCG GGCAAGGCGG GGCAGGTGGT GCCCGCAGCG 180

ATGGTGGCGC GTTGGGTGGT ACCGGCGGGA CGGGCGGTAC CGGCGGCGCC GGTGGCGCCG 240

GCGGTCGCGG CAC^CTGCTG CTGGGCGCTG GCGGACAGGG CGGCCTCGGC GGGGCCGGCG 300

GACAAGGCGG CACCGGCGGG GGCCGGCGGA GATGGCGTTC TGGGGGGTGT CAGTGGCACT 360

GGTGGTA 367

(2) INFORMATION FOR SEQ ID NO: 250:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 420 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 250: 

AAGGCGTGAT TGGCAAGGCG ACCGCGCAGC GGCCCGTAGC CGCGGGACGG CCCAGGCCCC 60 

GACCGCAGCG GCCGGTGTCT OACCGGGTCA GCGACCAGCG GCGCTGACCG TGCCGCTCGT 120 

CTACTTCGAC GCCAGCGCCT TCGTCAAACT TCTCACCACC GAGACAGGGA GCTCGCTGGC 180 

GTCCGCTCTA TGGGACGGCT GCGACGCCGC ATTGTCCAAC CGCCTGGCCT ACCCCGAAGT 240 

CCGCGCCGCA CTCGCTGCAA CGGGCCGCAA TCACGACCTA ACCGAATCCG AGCTCGCC3A 300

CGCCGAGCGT GACTGGGAGG ACTTCTGGGC CGCACCCGCC CAGTCGAACT CACCGCGACG 360 

GTTGAACAGC ACGCCGGGCA CCTCGCCCGA ACACATGCCT TACGCGGAGC CGACACCGTT 420 

(2) INFORMATION FOR SEQ ID NO: 251: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 299 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:251: 

CTCTTGTCGG TGGCATCGGC GGTACCGGCG GAACCGGCGG CAACGCCGGT ATGCTCGCCG 60

GCGCCGCCGG GGCCGGCGGT GCCGGCGGGT TCAGCTTCAG CACTGCCGGT GGGGCTGGCG 120

GCGCCGGCGG GGCCGGTGGG CTGTTCACCA CCGGCGGTGT CGGCGGCGCC GGTGGGCAGG 180

GTCACACGGG CGGGGCGGGC GGCGCCGGCG GGGCCGGCGG GTTGTTTGGT GCCGGCGGCA 240

TGGGCGGGGC GGGCGGATTC GGGGATCACG GAACGCTCGG CACCGGCGGG GCCGGCGGG 299

(2) INFORMATION FOR SEQ ID NO: 252: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 252:

Leu Glu Pro Trp Ser Asp Gly Val Gly Leu Gly Val Asp Ser Phe Ser
1 5 10 15

Ala Lys Val Asn
20

(2) INFORMATION FOR SEQ ID NO: 253:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 121 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 253:

Glu Leu Leu Phe Gly Ala Gly Gly Ala Gly Gly Ala Gly Gly Ala Gly
1 5 10 15

Thr Asp Gly Gly Pro Gly Ala Thr Gly Gly Thr Gly Gly His Gly Gly
20 25 30

Val Gly Gly Asp Gly Gly Trp Leu Ala Pro Gly Gly Ala Gly Gly Ala
35 40 45

Gly Gly Gln Gly Gly Ala Gly Gly Ala Arg Ser Asp Gly Gly Ala Leu
50 55 60

Gly Gly Thr Gly Gly Thr Gly Gly Thr Gly Gly Ala Gly Gly Ala Gly
65 70 75 80

Gly Arg Gly Thr Leu Leu Leu Gly Ala Gly Gly Gln Gly Gly Leu Gly
85 90 95

Gly Ala Gly Gly Gln Gly Gly Thr Gly Gly Gly Arg Arg Arg Trp Arg
100 105 110

Ser Gly Gly Cys Gln Trp His Trp Trp
115 120



(2) INFORMATION FOR SEQ ID NO: 254: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 34 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 254:

Gly Val Ile Gly Lys Ala Thr Ala Gln Arg Pro Val Ala Ala Gly Arg
1 5 10 15

Pro Arg Pro Arg Pro Gln Arg Pro Val Ser Asp Arg Val Ser Asp Gln
20 25 30

Arg Arg



(2) INFORMATION FOR SEQ ID NO: 255: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 99 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 255:

Leu Val Gly Gly Ile Gly Gly Thr Gly Gly Thr Gly Gly Asn Ala Gly
1 5 10 15

Met Leu Ala Gly Ala Ala Gly Ala Gly Gly Ala Gly Gly Phe Ser Phe
20 25 30

Ser Thr Ala Gly Gly Ala Gly Gly Ala Gly Gly Ala Gly Gly Leu Phe
35 40 45

Thr Thr Gly Gly Val Gly Gly Ala Gly Gly Gln Gly His Thr Gly Gly
50 55 60

Ala Gly Gly Ala Gly Gly Ala Gly Gly Leu Phe Gly Ala Gly Gly Met
65 70 75 80

Gly Gly Ala Gly Gly Phe Gly Asp His Gly Thr Leu Gly Thr Gly Gly
85 90 95

Ala Gly Gly 



(2) INFORMATION FOR SEQ ID NO: 256: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 282 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 256: 

TCCTGTTCGG CGCCGGCGGG GTGGGCGGTG TTGGCGGTGA CGGTGTGGCA TTCCTGGGCA 60

CCGCCCCCGG CGGGCCCGGT GGTGCCGGCG GGGCCGGTGG GCTGTTCAGC GTCGGTGGGG 120

CCGGCGGCGC CGGCGGAATC GGATTGGTCG GGAACAGCGG TGCCGGGGGG TCCGGCGGGT 180

CCGCCCTGCT CTGGGGCGAC GGCGGTGCCG GCGGCGCGGG TGGGGTCGGG TCCACTACCG 240

GCGGTGCCGG CGGGGCGGGC GGCAACGCCA GCCTGCTGGT AA 282

(2) INFORMATION FOR SEQ ID NO: 25 7: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 415 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 257: 

GGGCACGAGC CGTGCTACTG GTCAACTGAT GCCCTGATTG TGACCTTCCC GGCGCCGGAT 60

CAGTGCTTCT CAGGACCGAC GTAATATTCG AAAACCAATC CGGCCGCCGA GGCGAGGATG 120

AATGCCACAC CGGCGGCGAT CAGCCACGGG AGCCACAACG CGATGCCGAC CGCTGCCACC 180

GAGCCGGACA ACGCGACCAT GATCGGCCAC CAGCTATGCG GACTGAAGAA TCCAAGTTCT 240

CCTGCGCCGT CGCTGATTTC AGCGCCTTCG TAGTCCTCGG GCCGGGAATC TAACCGGCGG 300

GCCACAAACC GGAAGAAGGT GGCGACGATC AACGCCATGC CGCCGGTGAG CGCCAACGCA 360

ATGGTGCCAG CCCACTCGAC ACCACCGGTG GCGAACATCG AGGTCAACAC GCCGT 415



(2) INFORMATION FOR SEQ ID NO:258: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 373 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:258:

TCACCGCGTG AACGGTTCGT AACACTGATA CGTATGCTTG TCAGCGAGCA GATCAAGTCC 60

AGTCCGACCA ATGCCAGGAG ATCATCGGCT AGGCTCACGG TTTCGCCTGG GACGAGACGG 120

TATTGAGTTC TGGCGTTGGA CGGTCCGTGG CGTGGTGGGA AGTCTGACGC GGCATCAGAA 180

CGGTTGTCAA TACCAGTCrr TGGGGGATAT GGCCTATTTG GTGTCGTCGG GCCGCTCCAC 240

CGGATCCCTT TTCGAACGTT GCGCAAGCGC GGTCCAGTTA CGGCCTGTTC ACTGCGCGCT 300

GGCGTAGCTG CGCGGCCTCG ATCGGTTTGA ACGTCATCGC AATTCCCGCA ATGGGTGAGT 360

ACCTGACGCT CCT 373



(2) INFORMATION FOR SEQ ID NO: 259:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 423 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 259:

CCAAACCGGA CAGGCCGGCSi GCGACGGTCG GAAGTTGCAC CACGGTGCGC GCTCCATGTA 60

GCCAACCGGT GACCACGGCG TAGACAGCAG ATCCGTGGAT CGCGCGTTCG GTGTCGTCCG 120

GGCCGAGTAC CCGCGGGCCG AACCGCAGCG ACCAAAGCAA CGCGATCGAT ACGGGGATCG 180

CCACTCGTGC CGAATTCGAG CTCCGTCGAC AAGCTTGCGG CCGCACTCGA ACCCGGGTGA 240

ATGATTGAGT TTAAACCGCT TAGCAATAAC TAGCATAACC CCTTGGGGCC TCTAAACGGG 300

TCTTGAGGGG TTTTTTGCTG AAAGGAGGAA CTATATCCGG ATAACCTGGC GTAGTAGCGA 360

AGAGGCCCGC ACCGATCGCC CTTCCCAACA GTTGCGCAGC CTGAATGGCG AATGGACGCG 420
CCC 423

(2) INFORMATION FOR SEQ ID NO: 260:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 404 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 260:

AGTGGCCAGC CGGTCGGCCA ATGCATCCAG CTCCCGGTAC GTCAGCTGAC CATCCGCCCA 60
ACTGACCGCC ACCGAGTCAG GCTGTGCCGC AGCGATTTCG GCGAACCGG^ StgSccS 120
GGGTGCCGAC GTCGTCACAT CCGGCAGGCC GGGTGCGGTC GGATCGTGCT CGCcScSg 180
cagaatgtcg acgtcgcgca gcggccgatc ccaccggctg accaagSS SSSSc 240
cagcacccgc ctgccgaggc tttcgggcgc catcgtgccc agcgcaccgt cg^SS?^ 300
SSctSS J^^^CTCAC CG0TGCT.3CG GTGCGCGGCG ACGGTCACCG S^SgS 360
caaactctct agcgccaccg gacggaacgt caccccgttt gcga 404

(2) INFORMATION FOR SEQ ID NO: 261:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 421 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 261:

^r^™^^ CAGGCTGTTC TTCGAACCCG CTGGCTAACT TCGCACCCGG GTATCCGCCC 60
ACCATCGAAC CCGCCCAACC GGCGGTGTCA CCGCCTACTT CGCAAGACCC GGCCGGTpS 120
^"S^'^^ =<^™ttc= aSTcSS S=cSI?S 180
GxGGCTCTGC oCCCGGGCGC CGATTCGGCG GCACCCGCCA GCATCATGGT CTTCGATfflr 240
gIcSSS ZT^c^ ^--^ S^SgcS 300
GAC^CGGCA cggccttcct tgccgcccgc ggcggctact tcgtggccga cctgtcctc 360
GGTCACACCG CACGAGTGAA TGTCGCTGAC GCAGCGCACA CCGATTTCAC CGCgItScC 420
421

(2) INFORMATION FOR SEQ ID NO: 262:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 426 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 262:

gIc^Jg"S SS^^n^ CATCCTGCGT GCGATCTTCG GGGCCGGCGG CAGTGAACTA 60
?-gIJ^cS ^r^r^J^l TCCGCCGTGG GTCACGCTGG GCTCGCGCCT GGCGGCGCTA 120
cgcJ^taS ^SS^; TGGCCGCCTT AGCCCGTGGG GCCGGCTGGC CGAGTGGCGG 180
GC^S?cS cS^^™ CGACGAGCTC ATCGAAGCCG AGCGGGCCGA CCCGAACTTC 240
StcgS aS^c™ GGCGTTGATG CTGCGCAGCA CITACGACGA CGGTTCCATC 300
aSgcSS i^r^rr^nr CTCACGCTGC rTGCCGCCGG GCACGAAACC 360
^^^^^r^''^^ CATGGGCTGG GCGTTCGAAC GGCTCAACCG GCACCCCGAC GTGCTCGCGG 420
426

(2) INFORMATION FOR SEQ ID NO: 263:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 522 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:263:

ann^i^^f CAGGCTGTTC TTCGAACCCG CTGGCTAACT TCGCACCCGG GTATCCGCCC 60
ACCATCGAAC CCGCCCAACC GGCGGTGTCA CCGCCTACTT CGCAAGAC^ G^ISgcI 120
r^r^^^^J' TOAGCGGCCA CCCCCGGGCG GCACTATTCG AcS^SSc ScSI^S 180
GTGGCTCTGC GCCCGGGCGC CGATTCGGCG GCACCCGCCA GCATCATGGT CTTcS^Sc 240
GTGCACGTTG CACCGCGCGT CATTTTTCTG CCGGGCCCGG CAGcScS SScSc 300
GACCACGGCA CGGCCTTCCT TGCCGCCCGC GGCGGCTACT TCGtScSI S^S^cS^ 360
GGTCACACCG CACGAGTGAA TGTCGCTGAC GCAGCGCACA CcSSS^ cSStCGc' 420
ACGGCAAGCT GGTGCTGGGC AGCGCAGATG GCGCcSSa S^SSgCC 480
aagaacccgc AGTTGACCGG CGTCGGCGCC gccaccgtag cc cacgcttgcc 522

(2) INFORMATION FOR SEQ ID NO: 264:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 739 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:264:

ScS^'c' '^SS^^^^^ GCCCCTGGGC CCAGACCCCG CGCAAAACCA 60

TCGgStc-G gSJ^C^^^^ CCGTCGTGCT CGTCCTCGTG TTGGGCGCCA 120

AGCgSSIg ScCc'a^'g ^'gS^^^^ "Tt^''' '^^^^^^^ OrrGCGGAGG 180

-CATGCArr- rrr^^l.nZ CAGAAGTCAA CGCCGTGATG GGCTCGTCGT 240

CgII^S GGgSSS' ^'I^^^^f"'' TGGACTCITC GCCGGTGACG GTGTCCCTGC 300

cSS?^ Sgc^gJS^ ^ AGGATCCGGT GTATGCCGGC ACCGGCTACA 360

AAGcSc^ CGCC^cS Ac'g^I^S SSn^S^ TXK..TGAACC 420

£™ ™c ssss? sss
~ ~ ~

gSSSSg tSS^GG -^^^"^^^^ ATCAAGCAGG CCAGATCGCC GCCAAGATCT

660
720
739

(2) INFORMATION FOR SEQ ID NO: 265:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 69 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 265:

AGACGTCGTC GAGGCCGCCA TCGCCCGCGC CGAAGCCGTT AACCCGGCAC TGAACGCGTT 60
GGCGTATGC 69

(2) INFORMATION FOR SEQ ID NO: 266:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 523 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 266:

ACTGCACCCG GCAGGCGCGA CCAACGGATC GGGTCAACTA GCACTGCCGG TGGAGGCGC- 60
CCCGCGGTCT GTGCCTTCCC ACGGGGAACC CTTGGGCAGC GCGGCTCCAG AAGGGTTGgI 120
GGGAGAGTTC GACGACCGTA TCGACGAGCG GTTCCCGGTC TTCAGCTC3G CCAGTCTCGC 180
CCGGGTCCGC TGACCCCGAT GACGCTGGAT GTCCAGTTGA GTGGACTGCG 240
CGCGGCCGGT CGGGCGATGG GTCGGGTACT GGCGCTTGGC GGTGTCGTTG CCGATGAGTG 300
GGAGAGAAGA GCCATCGCGG TGTTCGGTCA CCGCCCGTAT ATCGGAGTGT CGGCCAATAT 360
TGTGGCCGCC GCCCAACTGC CGGGGTGGGA CGCGCAGGCC GTAACCCGGC GGGCACTGGG 420
CGAGCAACCG CAGGTCACTG AGCTGCTTCC GTTTGGTCGA CCGCAACTTG CGGGCGGAC- 480
GCTCGGCTCG GTCGCGAAGG TGGTCGTGAC GGCACGGTCG CTG 523

(2) INFORMATION FOR SEQ ID NO:267:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 224 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 267:

GTGTCGGTGT CGTCGGGGTA GGAGCGACTT CCCCGGCCGG CGCCGGCGCC GGAGCGGGC-

CTGCAGGAAC CGGTGCCGGC GCCGGCGGCG GGGCGACCAA AGGCCGGATC GATTCGGCCA

.CGCCTTGGC CGCGCCCTTG TCCACCGGGT TGTTGGCGGT CCCGAGCCAT ACCACAAAC-

-^CGCTGAAG GGGCCCGGCG TCCGGTGCGT TCGCCGCGGG CGAC ^^^^AAAC.

(2) INFORMATION FOR SEQ ID NO: 268:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 521 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:268:

TGAACTGACT GCCCCGCTCG ATCGGCGGCG 3CGGC3TGTC ATAGCTGCGC CGCCAGGCCA 60

TGAACTGCTC TTCGCCATAG CGGGCCTTGG TCTCGGCCTT GTCCAAACCC TGCAGCGCGC 120

CGTAGTGGCG TrCGTTGAGC CGCCAGCTAC GCCGCACGGG AATCCAGAGC CGATCGGCGC 180

TGTCCAACGC CAGATGCGCG GTGGTGATCG CGCGCCGCAG CAACGAGGTG TAGAGCACGT 240

CGGGCAATAG GTCGTGTTCC GCGATCAGCT CGCCGCTTCG AACCGCCTCT GCCTGGCCCT 300

TGTCCGTCAG GCCGACATCG ACCCAGCCGG TGAACAGGTT GAGGGCATTC CAGTCGCTCT 360

CGCCGTGGCG CAGCAACACC AGGCTGCCAG TGTTTGCCAT ACCGGCAAGT CTCTCACGCA 420

CTCCCGCACT CCTCATCGTG GACCAAAATG CCCGAATTCT CCTCGGTCCG CTGCGCAGCG 480
CGTTCATACC GCCGAGGTGG TCGGCACCGT AACGGCCGGT T 521

(2) INFORMATION FOR SEQ ID NO: 269: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 426 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:269:

CTCCAGGCTC ATTCGCTCGA ACAAAGCCAC CCGGCCGTAC AGCGGACGCC CCCATTCGTT 

GTCGTGATAG TCGCGGTACA GCTGGGCATC GGGCCCTGGA CGAACCTCCG CCCAGGGGCA 

GCGAACCAGC CCGTCGCCGC TCACGCGGGG TCAGAACGGT AGTGCACGAC AGTCTCGCCG 

CGCGAAGGGT TTGACGCGTC AGACTCGGCC TCGGCGTCTT CCGACGAGGC GTGGATCGCC 

CCGAGCTGAG AGCGTAGCGC CTCGAGCTCA CGGCCGAGCC GTTCCAGCAC CCAGTCCAC^ 

TCGCTGGTCT TGTTCCCGCG CAGCACCTGC GTGAACTTGA CCGCGTCGAC ATCGGCGCGG 

GTGACCCCGA ACGCCGGCAG CGTCGTCGCC GTCGTCGCCC GCGGCAGGGG CGGCAACTGC 
TCGCCA 

(2) INFORMATION FOR SEQ ID NO: 270:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 219 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:270: 

GCGGACACGG CGGACAAAGC GCAATCGGCC TCGGCGGCGG CGCCGGCGGC GACGGGGGCC 

AGGGCGGCGC CGGCCGCGGA CTGTGGGGTA CTGGCGGCGC CGGCGGACAC GGCGGGGCAA 

GGCoGTGGTA CCGGGGGCCC ACCGCTGCCC GGTCAGGCAG GCATGGGCGC CGCGGGTGGC 
GCCGGTGGGC TGATCGGC^A CGGCGGGGCC GGCGGCGAC 

(2) INFORMATION FOR SEQ ID NO: 271:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 571 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 271: 

AAGATCATCG GCGCCGCTCC TTAGCATCGC TGCGCTCTGC ATCGTCGCCG GCGCGGATCA 60 

CGGAGGTCCG GCCTTGTACC CCACTCCTCG AACGGTCAGC ACCACAGTCG GGTTCTCGGG 120 

ATCCTTTTCG ACCTTGGCCC GCAGACGCTG GACATGCACG TTCACCAGCC TGGTATCGGC 180 

TGGGTGCCGG TAACCCCATA CCTGTTCGAG CAGCACATCA CGAGTAAACA CCTGGCGCGG 240 

CTTGCGCGCC AATGCGACCA ACAGGTCGAA TTCCAGCGGT GTCAACGAGA TCTGCTCACC 300

GTTGCGAGTG ACCTTGTGCG CCGGTACGTC GATTTCTACG TCGGCGATGG ACAGCATCTC 360 

GGCGGGTTCG TCGTCGTTGC GGCGCAGCCG CGCCCGCACC CGCGCAACCA GCTCCTTGGG 420 

CTTGAACGGC TTCATGATGT AGTCGTCGGC GCCCGACTCC AGACCCAGCA CCACATCCAC 480 

GGTGTCGGTC TTTGCGGTGA GCATCACGAT CGGAACACCG GAATCGGCGC GCAACACCCG 540 

GCACACGTCG ATGCCGTTCA TACCGGGGCA A 571 

(2) INFORMATION FOR SEQ ID NO: 272: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 93 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 272:

Leu Phe Gly Ala Gly Gly Val Gly Gly Val Gly Gly Asp Gly Val Ala
1 5 10 15

Phe Leu Gly Thr Ala Pro Gly Gly Pro Gly Gly Ala Gly Gly Ala Gly
20 25 30

Gly Leu Phe Ser Val Gly Gly Ala Gly Gly Ala Gly Gly Ile Gly Leu
35 40 45

Val Gly Asn Ser Gly Ala Gly Gly Ser Gly Gly Ser Ala Leu Leu Trp
50 55 60

Gly Asp Gly Gly Ala Gly Gly Ala Gly Gly Val Gly Ser Thr Thr Gly
65 70 75 80

Gly Ala Gly Gly Ala Gly Gly Asn Ala Ser Leu Leu Val
85 90







(2) INFORMATION FOR SEQ ID NO: 273: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 273:

Met Pro Pro Val Ser Ala Asn Ala Met Val Pro Ala His Ser Thr Pro

1 5 10 15

Pro Val Ala Asn Ile Glu Val Asn Thr Pro
20 25

(2) INFORMATION FOR SEQ ID NO: 274:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:274:

Lys Pro Asp Arg Pro Ala Ala Thr Val Gly Ser Cys Thr Thr Val Arg
1 5 10 15

Ala Pro Cys Ser Gln Pro Val Thr Thr Ala
20 25

(2) INFORMATION FOR SEQ ID NO: 275: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:275:
Trp Pro Ala Gly Arg Pro Met His Pro Ala Pro Gly Thr Ser Ala Asp 

His Pro Pro Asn 
20 

(2) INFORMATION FOR SEQ ID NO: 276:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 140 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:276:

Val Leu Val Ala Gly Cys Ser Ser Asn Pro Leu Ala Asn Phe Ala Pro
1 5 10 15

Gly Tyr Pro Pro Thr Ile Glu Pro Ala Gln Pro Ala Val Ser Pro Pro
20 25 30

Thr Ser Gln Asp Pro Ala Gly Ala Val Arg Pro Leu Ser Gly His Pro
35 40 45

Arg Ala Ala Leu Phe Asp Asn Gly Thr Arg Gln Leu Val Ala Leu Arg
50 55 60

Pro Gly Ala Asp Ser Ala Ala Pro Ala Ser Ile Met Val Phe Asp Asp
65 70 75 80

Met His Val Ala Pro Arg Val Ile Phe Leu Pro Gly Pro Ala Ala Ala
85 90 95

Leu Thr Ser Asp Asp His Gly Thr Ala Phe Leu Ala Ala Arg Gly Gly
100 105 110

Tyr Phe Val Ala Asp Leu Ser Ser Gly His Thr Ala Arg Val Asn Val
115 120 125

Ala Asp Ala Ala His Thr Asp Phe Thr Ala Ile Ala
130 135 140



(2) INFORMATION FOR SEQ ID NO: 277:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 142 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 277:

Met His Ile Thr Leu Asn Ala Ile Leu Arg Ala Ile Phe Gly Ala Gly
1 5 10 15

Gly Ser Glu Leu Asp Glu Leu Arg Arg Leu Ile Pro Pro Trp Val Thr
20 25 30

Leu Gly Ser Arg Leu Ala Ala Leu Pro Lys Pro Lys Arg Asp Tyr Gly
35 40 45

Arg Leu Ser Pro Trp Gly Arg Leu Ala Glu Trp Arg Arg Gln Tyr Asp
50 55 60

Thr Val Ile Asp Glu Leu Ile Glu Ala Glu Arg Ala Asp Pro Asn Phe
65 70 75 80

Ala Asp Arg Thr Asp Val Leu Ala Leu Met Leu Arg Ser Thr Tyr Asp
85 90 95

Asp Gly Ser Ile Met Ser Arg Lys Asp Ile Gly Asp Glu Leu Leu Thr
100 105 110

Leu Leu Ala Ala Gly His Glu Thr Thr Ala Ala Thr Trp Ala Gly Arg
115 120 125

Ser Asn Gly Ser Thr Gly Thr Pro Thr Cys Ser Arg Leu Trp
130 135 140



(2) INFORMATION FOR SEQ ID NO: 278:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 163 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 278:

Val Leu Val Ala Gly Cys Ser Ser Asn Pro Leu Ala Asn Phe Ala Pro
1 5 10 15

Gly Tyr Pro Pro Thr Ile Glu Pro Ala Gln Pro Ala Val Ser Pro Pro
20 25 30

Thr Ser Gln Asp Pro Ala Gly Ala Val Arg Pro Leu Ser Gly His Pro
35 40 45

Arg Ala Ala Leu Phe Asp Asn Gly Thr Arg Gln Leu Val Ala Leu Arg
50 55 60

Pro Gly Ala Asp Ser Ala Ala Pro Ala Ser Ile Met Val Phe Asp Asp
65 70 75 80

Val His Val Ala Pro Arg Val Ile Phe Leu Pro Gly Pro Ala Ala Ala
85 90 95

Leu Thr Ser Asp Asp His Gly Thr Ala Phe Leu Ala Ala Arg Gly Gly
100 105 110

Tyr Phe Val Ala Asp Leu Ser Ser Gly His Thr Ala Arg Val Asn Val
115 120 125

Ala Asp Ala Ala His Thr Asp Phe Thr Ala Ile Ala Arg Arg Ser Asp
130 135 140

Gly Lys Leu Val Leu Gly Ser Ala Asp Gly Ala Val Tyr Thr Leu Ala
145 150 155 160

Lys Asn Pro



(2) INFORMATION FOR SEQ ID NO: 279:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 240 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 279:

Trp Gly Ala Pro Pro Ser Gly Gly Pro Ser Pro Trp Ala Gln Thr Pro
1 5 10 15

Arg Lys Thr Asn Pro Trp Pro Leu Val Ala Gly Ala Ala Ala Val Val
20 25 30

Leu Val Leu Val Leu Gly Ala Ile Gly Ile Trp Ile Ala Ile Arg Pro
35 40 45

Lys Pro Val Gln Pro Pro Gln Pro Val Ala Glu Glu Arg Leu Ser Ala
50 55 60

Leu Leu Leu Asn Ser Ser Glu Val Asn Ala Val Met Gly Ser Ser Ser
65 70 75 80

Meu Gln Pro Gly Lys Pro Ile Thr Ser Met Asp Ser Ser Pro Val Thr
85 90 95

Val Ser Leu Pro Asp Cys Gln Gly Ala Leu Tyr Thr Ser Gln Asp Pro
100 105 110

Val Tyr Ala Gly Thr Gly Tyr Thr Ala Ile Asn Gly Leu Ile Ser Ser
115 120 125

Glu Pro Gly Asp Asn Tyr Glu His Trp Val Asn Gln Ala Val Val Ala
130 135 140

Phe Pro Thr Ala Asp Lys Ala Arg Ala Phe Val Gln Thr Ser Ala Asp
145 150 155 160

Lys Trp Lys Asn Cys Ala Gly Lys Thr Val Thr Val Thr Asn Lys Ala
165 170 175

Lys Thr Tyr Arg Trp Thr Phe Ala Asp Val Lys Gly Ser Pro Pro Thr
180 185 190

Ile Thr Val Ile Asp Thr Gln Glu Gly Ala Glu Gly Trp Glu Cys Gln
195 200 205

Arg Ala Met Ser Val Ala Asn Asn Val Val Val Asp Val Asn Ala Cys
210 215 220

Gly Tyr Gln Ile Thr Asn Gln Ala Gly Gln Ile Ala Ala Lys Ile Cys
225 230 235 240



(2) INFORMATION FOR SEQ ID NO: 280: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 280:

Asp Val Val Glu Ala Ala Ile Ala Arg Ala Glu Ala Val Asn Pro Ala

1 5 10 15

Leu Asn Ala Leu Ala Tyr



(2) INFORMATION FOR SEQ ID NO: 281: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 174 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 281:

Leu His Pro Ala Gly Ala Thr Asn Gly Ser Gly Gln Leu Ala Leu Pro
1 5 10 15

Val Glu Ala Pro Pro Arg Ser Val Pro Ser His Gly Glu Pro Leu Gly
20 25 30

Ser Ala Ala Pro Glu Gly Leu Glu Gly Glu Phe Asp Asp Arg Ile Asp
35 40 45

Glu Arg Phe Pro Val Phe Ser Ser Ala Ser Leu Ala Glu Ala Leu Pro
50 55 60

Gly Pro Leu Thr Pro Met Thr Leu Asp Val Gln Leu Ser Gly Leu Arg
65 70 75 80

Ala Ala Gly Arg Ala Met Gly Arg Val Leu Ala Leu Gly Gly Val Val
85 90 95

Ala Asp Glu Trp Glu Arg Arg Ala Ile Ala Val Phe Gly His Arg Pro
100 105 110

Tyr Ile Gly Val Ser Ala Asn Ile Val Ala Ala Ala Gln Leu Pro Gly
115 120 125

Trp Asp Ala Gln Ala Val Thr Arg Arg Ala Leu Gly Glu Gln Pro Gln
130 135 140

Val Thr Glu Leu Leu Pro Phe Gly Arg Pro Gln Leu Ala Gly Gly Pro
145 150 155 160

Leu Gly Ser Val Ala Lys Val Val Val Thr Ala Arg Ser Leu
165 170
170 



(2) INFORMATION FOR SEQ ID NO: 282: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 61 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:282:

Val Gly Val Val Gly Val Gly Ala Thr Ser Pro Ala Gly Ala Gly Ala

1 5 10 15

Gly Ala Gly Ser Ala Gly Thr Gly Ala Gly Ala Gly Gly Gly Ala Thr

20 25 30

Lys Gly Arg Ile Asp Ser Ala Ser Ala Leu Ala Ala Pro Leu Ser Thr

35 40 45

Gly Leu Leu Ala Val Pro Ser His Thr Thr Asn Gln Arg
50 55 60



(2) INFORMATION FOR SEQ ID NO: 283:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 133 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:283:

Met Ala Asn Thr Gly Ser Leu Val Leu Leu Arg His Gly Glu Ser Asp

1 5 10 15

Trp Asn Ala Leu Asn Leu Phe Thr Gly Trp Val Asp Val Gly Leu Thr

20 25 30

Asp Lys Gly Gln Ala Glu Ala Val Arg Ser Gly Glu Leu Ile Ala Glu

35 40 45

His Asp Leu Leu Pro Asp Val Leu Tyr Thr Ser Leu Leu Arg Arg Ala

50 55 60

Ile Thr Thr Ala His Leu Ala Leu Asp Ser Ala Asp Arg Leu Trp Ile
65 70 75 80

Pro Val Arg Arg Ser Trp Arg Leu Asn Glu Arg His Tyr Gly Ala Leu

85 90 95

Gln Gly Leu Asp Lys Ala Glu Thr Lys Ala Arg Tyr Gly Glu Glu Gln
100 105 110

Phe Met Ala Trp Arg Arg Ser Tyr Asp Thr Pro Pro Pro Pro Ile Glu

115 120 125

Arg Gly Ser Gln Phe
130

(2) INFORMATION FOR SEQ ID NO: 284:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 63 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:284:

Pro Gly Ser Phe Ala Arg Thr Lys Pro Pro Gly Arg Thr Ala Asp Ala

1 5 10 15

Pro Ile Arg Cys Arg Asp Ser Arg Gly Thr Ala Gly His Arg Ala Leu

20 25 30

Asp Glu Pro Pro Pro Arg Gly Ser Glu Pro Ala Arg Arg Arg Ser Arg

35 40 45

Gly Val Arg Thr Val Val His Asp Ser Leu Ala Ala Arg Arg Val
50 55 60 

(2) INFORMATION FOR SEQ ID NO: 285:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 72 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:285:

Gly His Gly Gly Gln Ser Ala Ile Gly Leu Gly Gly Gly Ala Gly Gly
1 5 10 15

Asp Gly Gly Gln Gly Gly Ala Gly Arg Gly Leu Trp Gly Thr Gly Gly
20 25 30

Ala Gly Gly His Gly Gly Ala Arg Arg Trp Tyr Arg Gly Pro Thr Ala
35 40 45

Ala Arg Ser Gly Arg His Gly Arg Arg Gly Trp Arg Arg Trp Ala Asp
50 55 60

Arg Gln Arg Arg Gly Arg Arg Arg
65 70

(2) INFORMATION FOR SEQ ID NO: 286:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 74 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 286:

Asp His Arg Arg Arg Ser Leu Ala Ser Leu Arg Ser Ala Ser Ser Pro
1 5 10 15

Ala Arg Ile Thr Glu Val Arg Pro Cys Thr Pro Leu Leu Glu Arg Ser
20 25 30

Ala Pro Gln Ser Gly Ser Arg Asp Pro Phe Arg Pro Trp Pro Ala Asp
35 40 45

Ala Gly His Ala Arg Ser Pro Ala Trp Tyr Arg Leu Gly Ala Gly Asn
50 55 60

Pro Ile Pro Val Arg Ala Ala His His Glu
65 70



















(2) INFORMATION FOR SEQ ID NO: 287:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 174 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:287:

CCGCACGTAA CACCGTGAAT TGAAGGGAGC CGCTGGTCAT GGGCCGATTC TATCCGTGGG
CGAACGGTTA TTGACGGCCC GGAGGCCACT CCGCTGCCAC CAAGTGGTGA CTCAGCGCGT
TTTCACGGCA ACGAACGGCG GACACACCAC TTGACATTCG ACAGCACGGC CGCG

(2) INFORMATION FOR SEQ ID NO:288:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 404 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:288:

TCGCAAACGG GGTGACGTTC CGTCCGGTGG CGCTAGAGAG TTTGTCGCAC TTTCCGGTGA 60
CCGTCGCCGC GCACCGCAGC ACCGGTGAGC TCACGCTGCT AGTGGAGGTG CTCGACGGTG 120
GGCTGGGCAC GATGGCGCCC GAAAGCCTCG GCAGGCGGGT GCTGGCTGTG TTACAGCGCT 180
TGGTCAGCCG GTGGGATCGG CCGCTGCGCG ACGTCGACAT TCTGCTGGAC GGCGAGCACG 240
ATCCGACCGC ACCCGGCCTG CCGGATGTGA CGACGTCGGC ACCCGCGGTG CATACCCGGT 300
TCGCCGAAAT CGCTGCGGCA CAGCCTGACT CGGTGGCGGT CAGTTGGGCG GATGGTCAGC 360
TGACGTACCG GGAGCTGGAT GCATTGGCCG ACCGGCTGGC CACT 404



(2) INFORMATION FOR SEQ ID NO: 289:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 134 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:289:

Ala Asn Gly Val Thr Phe Arg Pro Val Ala Leu Glu Ser Leu Ser His
1 5 10 15

Phe Pro Val Thr Val Ala Ala His Arg Ser Thr Gly Glu Leu Thr Leu
20 25 30

Leu Val Glu Val Leu Asp Gly Ala Leu Gly Thr Met Ala Pro Glu Ser
35 40 45

Leu Gly Arg Arg Val Leu Ala Val Leu Gln Arg Leu Val Ser Arg Trp
50 55 60

Asp Arg Pro Leu Arg Asp Val Asp Ile Leu Leu Asp Gly Glu His Asp
65 70 75 80

Pro Thr Ala Pro Gly Leu Pro Asp Val Thr Thr Ser Ala Pro Ala Val
85 90 95

His Thr Arg Phe Ala Glu Ile Ala Ala Ala Gln Pro Asp Ser Val Ala
100 105 110

Val Ser Trp Ala Asp Gly Gln Leu Thr Tyr Arg Glu Leu Asp Ala Leu
115 120 125

Ala Asp Arg Leu Ala Thr















(2) INFORMATION FOR SEQ ID NO: 290:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 526 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 290:

GCTTCGACGG CTACGAGTAC CTGTTCTGGG TGGGTTGTGC GGGCGCCTAC GACGACAAGG 60
CCAAGAAGAC CACCAAGGCC GTCGCCGAGC TGTTCGCCGT CGCCGGGGTG AAATACTTGG 120
TGCTGGGCGC TGGGGAAACC TGCAACGGCG ACTCGGCGCG CCGCTCCGGC AACGAGTTCC 180
TCTTCCAGCA GCTGGCACAA CAGGCCGTCG AGACCCTGGA CGGTTTGTTC GAGGGTGTGG 240
AGACCGTCGA CCGCAAGATC GTTGTCACCT GCCCGCACTG CTTCAACACC ATCGGCAAGG 300
AATATCGGCA GCTGGGCGCC AACTACACCG TGCTGCACCA CACCCAGCTG CTCAATCGGT 360
TGGTGCGCGA CAAGAGGCTG GTCCCTGTCA CTCCGGTTTC TCAGGACATC ACCTACCACG 420
ACCCGTGCTA CCTGGGTCGG CACAACAAGG TCTACGAGGC ACCACGGGAG CTGATCGGTG 480
CCGCGGGGGC CACCTGAGCC GAGATGCCGC GCCATGCCGA CCGCAG 526

(2) INFORMATION FOR SEQ ID NO: 291:



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 487 base pairs 
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(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 291:

CTCGCCGCCG TGATCTGGCC GGCGAACTTC GTCAGTGCAT CCAGACCCCA ACGATCATCG 60
rl^^T!" ^^^'^^^^^^ CACCGCACCG GCCACCAGCA CCGCgSSJ GcS^SS^ 120
rrr^r^^. CCCGGGTGAG TGCCGGAAGC TGGGAGGCAA GAAAGAcSc SSSSJ 180
CCCAGGAACA TCGCCAACCC ACCCATCCGA GGGGTAGGCG TGACGTgSc SS^SJcc 240
CGCGGGTAGG CGACGGCTCC CAGGCGACTG GCCAGCATCC GCACCGgSc ^SgSS 300
taggtgatga tcgccgcggt cagcccgacc agcgcaagct cacgcS^ 360
AGAC™ gSc™ ^11?''^^ ™- ™CG 420
AAT?S GAAGAGCTGA ACACTCGCCG AACGTGCAAC AGCTGCGAAC 480
487

(2) INFORMATION FOR SEQ ID NO: 292:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 528 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:292:



60 



180 
240 
300 
360 
420 



''S^?"'? G?^Ic'S^ 55f '^^^^ CCGGCATX.TA CGAGCTTGAG TTCCCGGCGC 
::SV^f:::^ GTC^xCCGAC GGCCGTGGTC CGGTGTTGGT GCACGCTTTG GAAGGTT-r- -^n 

ccatgcgatc cggctggccg ccgcccacct caaggSgc? ^gaSS 

TGCGC^S^ S?cS?S' C^™: ro:.OCT^CC CTGTATGCGC 

S^S^S ™- ~cc 

— — ™- -ScTt? ™- ~- 

(2) INFORMATION FOR SEQ ID NO: 293: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 610 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 293:

CG^G?'^^^^ g'Sg^'c^^^^ CcL';T ^^^^^^^^-^ GCCGCCGACG CCGGCGriGC 60

^.uuL^v^v:, ^^^CGCCGTC ACCGGCTTTG CCGCCATCGC 180
CGCCGTTGCC GCCGCTGGTG GGGGTGGCGG CCTGGTTGAC GTATTGTTCC ACCGGCCCGG 240
CCCTTGACCC TTTGGCGGTG TCGATCGCGG CGTCGATGGA TCCGCCGACC ACGACGTGCG 300
AAGCCTCGCC TGCCGCCGCA GCCGCCCAAC TGTGTCGCGG CTCCTGCGAT TTGGCCCCGG 360
CCGACGAGAT GATGGGCACC ACCGGAGCCT GCGGCCGTCT GGGGGAGGCC AGCGCGGGTT 420
CGCGGTCACG CCATACGCGA CGGTGCGCCG CCGCTTCGGA GATITGCAGG CTGCGTTGCA 480
CCAGATCGAG CAGCGGTGTG CCCAGGGACT GGGTTAGCCC GTTGGCGCCG CCGTTGTAGC 540
GGCGAGCGCA ATA7CGGTCC CCACTCGACC CAACCGCGAC TCCATAAGCG ACACCATTCG 600
CGGTTGATGC 610

(2) INFORMATION FOR SEQ ID NO: 294:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 164 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 294:

Phe Asp Gly Tyr Glu Tyr Leu Phe Trp Val Gly Cys Ala Gly Ala Tyr
1 5 10 15

Asp Asp Lys Ala Lys Lys Thr Thr Lys Ala Val Ala Glu Leu Phe Ala
20 25 30

Val Ala Gly Val Lys Tyr Leu Val Leu Gly Ala Gly Glu Thr Cys Asn
35 40 45

Gly Asp Ser Ala Arg Arg Ser Gly Asn Glu Phe Leu Phe Gln Gln Leu
50 55 60

Ala Gln Gln Ala Val Glu Thr Leu Asp Gly Leu Phe Glu Gly Val Glu
65 70 75 80

Thr Val Asp Arg Lys Ile Val Val Thr Cys Pro His Cys Phe Asn Thr
85 90 95

Ile Gly Lys Glu Tyr Arg Gln Leu Gly Ala Asn Tyr Thr Val Leu His
100 105 110

His Thr Gln Leu Leu Asn Arg Leu Val Arg Asp Lys Arg Leu Val Pro
115 120 125

Val Thr Pro Val Ser Gln Asp Ile Thr Tyr His Asp Pro Cys Tyr Leu
130 135 140

Gly Arg His Asn Lys Val Tyr Glu Ala Pro Arg Glu Leu Ile Gly Ala
145 150 155 160

Ala Gly Ala Thr



(2) INFORMATION FOR SEQ ID NO: 295:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 161 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 295:

Arg Arg Arg Asp Leu Ala Gly Glu Leu Arg Gln Cys Ile Gln Thr Pro
1 5 10 15

Thr Ile Ile Asp Gln Ala Asp Ala His Asp His Arg Thr Gly His Gln
20 25 30

His Arg Gly His Ala Gly Gly Ile Asp Glu Pro Pro Gly Glu Cys Arg
35 40 45

Lys Leu Gly Gly Lys Lys Asp Gly Ala Asp Asn Ala Gln Glu His Arg
50 55 60

Gln Pro Thr His Pro Arg Gly Arg Arg Asp Val His Ile Ser Leu Pro
65 70 75 80

Arg Val Gly Asp Gly Ser Gln Ala Thr Gly Gln His Pro His Arg Thr
85 90 95

Gly Arg Lys Ile Gly Asp Asp Arg Arg Gly Gln Pro Asp Gln Arg Lys
100 105 110

Leu Thr Gln Arg Asp Thr Gly Ala Ala Ile Gly Gln Gly Glu Gln Ala
115 120 125

Thr Gly Asn Ala Gly His Ile Ala Gly His Leu Glu Thr Val Leu His
130 135 140

Gln Pro Glu Glu Leu Asn Thr Arg Arg Thr Cys Asn Ser Cys Glu Gln
145 150 155 160

Leu



(2) INFORMATION FOR SEQ ID NO: 296:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 175 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:296:



Glu 


Ala 


Arg 


Glu 


Tyr 


Glu 


Pro 


Gly 


Gin 


Pro 


Gly 










5 










10 




?he 


Pro 


Ala 


Pro 


Gin 


Leu 


Ser 


Ser 


Ser 


Asp 


Gly 








20 










25 




Val 


His 


Ala 


Leu 


Glu 


Gly 


Phe 


Ser 


Asp 


Ala 


Gly 






35 










40 






Ala 


Ala 


Ala 


His 


Leu 


Lys 


Ala 


Ala 


Leu 


Asp 


Thr 




50 










55 








Phe 


Ala 


Ile


Asp 


Glu 


Leu 


Leu 


Asp 


Tyr 


Arg 


Ser 


65


Phe 








70 










75 


Thr 


Lys 


Thr 


Asp 


His 


Phe 


Thr 


His 


Ser 


Asp 










85 










90 


Leu 


Tyr 


Ala 


Leu 


Arg 


Asp 


Ser 


Ile


Gly 


Thr 


Pro 


Gly 






100 










105 






Leu 


Glu 


Pro 


Asp 


Leu 


Lys 


Trp 


Glu 


Arg 


Phe 






115 










120 






Leu 


Leu 


Ala 


Glu 


Arg 


Leu 


Gly 


Val 


Arg 


Gln


Asn 




130










135 








Arg 


Pro 


Asp 


Gly 


Arg 


Ser 


Ala 


His 


Thr 


Thr 


Asp 



15 

Arg Gly Pro Val Lei 
30 

His Ala He Arg hex 
45 

Glu Leu Val Ala Se: 
60 

Arg Arg Pro Leu Met 

80 

Asp Pro Glu Leu Se: 
95 

Phe Leu Leu Leu Ale 
110 

He Thr Ala Val Arc 
125 

His Arg Pro Gly His 
140 
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145 150 155 160 

Phe Gln Gln Pro Gly Ala Ile Ser Asp Phe Gln Pro Phe Asp Leu
165 170 175 

(2) INFORMATION FOR SEQ ID NO: 297: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 178 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 297: 

Lys Pro Val Lys Glu Pro Val Pro Ala Leu Pro Pro Val Pro Pro Thr 

1 5 10 15

Pro Ala Leu Pro Pro Leu Pro Pro Leu Pro Pro Val Pro Gly Phe Pro 

20 25 30 

Thr Val Pro Pro Pro Gly Ser Met Ala Pro Leu Phe Arg Pro Phe Ser 

35 40 45 

Pro Ala Pro Pro Ser Pro Ala Leu Pro Pro Ser Pro Pro Leu Pro Pro 

50 55 60 

Leu Val Gly Val Ala Ala Trp Leu Thr Tyr Cys Ser Thr Gly Pro Ala 
65 70 75 80 

Leu Asp Pro Leu Ala Val Ser Ile Ala Ala Ser Met Asp Pro Pro Thr

85 90 95 

Thr Thr Cys Glu Ala Ser Pro Ala Ala Ala Ala Ala Gln Leu Cys Arg

100 105 110 

Gly Ser Cys Asp Leu Ala Pro Ala Asp Glu Met Met Gly Thr Thr Gly 

115 120 125 

Ala Cys Gly Arg Leu Gly Glu Ala Ser Ala Gly Ser Arg Ser Arg His 

130 135 140 

Thr Arg Arg Cys Ala Ala Ala Ser Glu Ile Cys Arg Leu Arg Cys Thr
145 150 155 160 

Arg Ser Ser Ser Gly Val Pro Arg Asp Trp Val Ser Pro Leu Ala Pro 
165 170 175 

Pro Leu 



(2) INFORMATION FOR SEQ ID NO: 298: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 921 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 298: 

AATTCGGCAC GARCAGCACC AACACCGGCT TCTTCAACTC CGGCGACGTC AATACCGGTA 60 
ZCGGCAACAC CGGCAGCTTC AACACCGGCA GCTTCAATCC GGGCGATTCC AACACCGGGG 120
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ATTTCAACCC ANGCAGCTAC CACACGGGGA CTCGGAAACA CCGGCGATTT TACACCGGCS 180 

CCTTCATCTC CGGCAGCTAC AGCAACGGGT CTTGTGGAGT GGAAATTATC AGGGCTCATT 240 

GGNTGCACCC GGSCTTRCGA ATCCCTCGKG CCAATTCAAC TCCTCNACAA GCTTGCGGCC 300 

GCACTCSAGC CCGGGTGAAT GATTGAGTTT AACCGCTNAN CAATAACTAG CATAACCCCT 360 

TKGGGCCTCT AAACGGGTCT TGAAGGGTTT TTTGCTGAAA GGANGAACTA TATCCGGATA 420 

ACTGGCGTAN TACGAAAAGC CGCACCGATC GCCTTCCCAA CAGTTGCGCA CCKGAATGGC 480

AATGGACCNC CCTKTTACCG GSCATTAACN CGGGGGTGTN GGKGTTACCC CCACGTNACC 540

GCTACCTTGC CANNSSCCTN RSGCCGTCTT TCSTTTCTTC CTTCCTTCTC CCMCTTCGCC 600 

GGTTCCCNTC AGCTCTAAAT CGGGGNNCCC TTTMGGGTTC CAATTATTGC TTACNGSCCC 660 

CCACCCCAAA AAYTNATTNG GGTTAATGTC CCTTMTTGGG CNTCCCCCTA WTNANNGTTT 720

TCCCCCTTNA CTTTGRSTCC CTTCYTTATW NTGAMNCTNT TTCCACYGGA AAAMNCTCCA 780 

CCNTTYSSGS TTTCCTTTGA WTTATMRGGR AATTSCAATY CCGCYTTKGG TTMAANTTAA 840

CYTATTTCNA ATTTTCCCGM TTTTMMNATR TTNSNCKCGM KNCTCCNRKA SSGNTTTCCT 900 

CCCCCYTTSS GKTYCCCCRN G 921

(2) INFORMATION FOR SEQ ID NO: 299: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1082 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 299: 



AATTCGGCAC GAGATANGGG CGCACCGGGG TCCGCAGCCG GCGGGACCGT CGCCAGCACC 60
ACCGGGGTCA ACAGCACCAC GGTGGCGTCC ANGCAGAGCG CCGCGGTGAT GGCGGCCGAG 120
ACGGCRAACA CCTGCCGTAG CAGTCGGTGC GACTCCGCGC TCGCTCGANC CATGGCCGCG 180
CCGGCTGCCT CGAACANGCC TTCGTCGTCC ACAGCTTAGC CAGCANCCAA ACCGCACCCA 240
GAAACCCACA CGCCCGCCGC CCCGGANACC TGCGCCATCG KCTGCTGGGG CGANATCCCC 300
CGATCGCTNA CANGATGACC GCTGCCGGAA CGCCGCCGCT GCCTCCGGGC AGCCGCGTGG 360
GCSGGGCAAC CGCGAACCCA NGAACACGGC AAGCAGTATC ANCGCAACAG CAATTGTCAA 420
GGGCTAAACG CTTCACATCC AGGGATCTCG CGGCGCCACA CCGTCGGMTC TGCAGSGCGA 480
CCCCNTCCTN GGGCGGNCAC TCNTCAAAGA TGCNGATCNA CAGKCTAGGT CTTCGGCCGA 540
TATGSAAGGN CCCAACGGNT TTAAAGCGGC SAAAAAASTC TCCCANTGGA TAAAATCAGC 600
CGGGGANCCC CCCGTGSCMM NGTCYCGGKC ATTNTTCAAC MGGTTTNACG GCGGKTGCNG 660
GCCAACTKGC CAAAMTTAAG KTNGGGGNTY CGGGGCGGTA ACCGGCNNTK NGCCCCTTAA 720
AAAACCGGNC YTTTCTKGAT TAMMACCGGN CCCCCAWTGG CGGKTGKTCC CANGNTYAAC 780
.^CCYCCCSS MNGGGKTGGS SAACCCTTCC CGNGGGGTTC NTKGTTSCYT AWMCCCCCGG 840
AAACCSGKYG GGKTGGCRTN WASSAMNCCC CMNGYYTCTT TAAAGGCCAN KNRAAWGKYT 900
CCTTGGGAAW CCTNCAATYC GAAAAYYCTC CTYMMGSSCN CTTKCWRTYN NRNGGGAACS 960
AMWTNYCCNC GWTTCAWTCG GGTCCGASMN AAACKCTTTY TTTTYCGSSC STCCMGGSNC 1020
SGGTKNANAN AAASATTTMC YYCNNNANKK YYYCSSGCTT CYKMGRRNRR GMGAACCCGR 1080
1082



(2) INFORMATION FOR SEQ ID NO: 300:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 990 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
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(ii) MOLECULE TYPE: Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:300: 

AATTGGCACG AGTGATCGCG CTGAAGCCGG TAGCGCGGGT GGCTCGGGTG GTTTGCGAAC 60
RAAATCCGCT CGANGTGGTC TCGGTAGGCG GTGTCCANAA CGGTGGCGCG GTGCCGGCGG 120
ATCTGATCGG CGCGGCCGTA GTGCACGTCG GCGGGCGTGT GCAGTCCGAT GCCGGAATGC 180
TTGTGTTCGT GGTTGTACCA GCCGAAGAAC CGGTCGCAGT GCACCCGGGC CGCCTCGATC 240
GACTCGAACC GTTTCGGGAA ATCGGGCCGG TACTTGAAGG TCTYGAACTG GGCCTCAGAC 300
AACGGGTTGT CTTGCTGGTG TGCGGGCGTG AGTGCGACTT GGTGACACCG AAGTCGGCCA 360
NCANCAATGC CACCGGTTTG GAACTCATCC ACAACCCCCG TCCGCGTCMA GGTCACTTGT 420
NCGGCGCTAA TTTNYTGGGC GGCAAGGGTT TGCCGAYCAN KCCGCTCGGC CAAAACTTCG 480
ANTCNCSCCA AGGCCNCCAT CCNCCCAAAC AMGTTACGGG ANAAAANATY CAAAGAYCAC 540
CYTCCGGKTN TTATANCTYC CCYTTTGSTY GGGCCCCCCN CYYTGKKNAT ACCCCTNCCA 600
AWTCCCAACN CCCKCCAANA RCYKGGGGCC CCCNCCAACC CGGGKGAAKA WTAATTTAAA 660
CCCYAACMAW ACTWMMNACC CNNGGGSCCY AAMCGTYYNR AGGTTTTSCT NAAAGAAASA 720
ANTCGGAAMC CGGNTSTACC AAAAASCCCK CCNWTCCCTC CRASATTGSC NCCSAAWKSA 780
AKGCCCCCNY TCSGCNWNNC CSGCGGKKKT KKGTTWCCCT WMRCWMWYTS GGCCNASCCN 840
CKYYSSMYCC CCCCTCCCCM CTCCGNKTCC CCAMCCYANC MGGCCCCYTM GKKCCCWKNT 900
YKGCCCCCCC AMMNNNGGGG WGACCCTNGG CCCCMKRRGM TCCCNANTGA MCCTCWGNRA 960
MKCYCCNRAR ANMCCSCNCC NGCNCRCKNN 990



(2) INFORMATION FOR SEQ ID NO: 301:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 223 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 301:



-^TTCGGGTG GCAACGCGGG CCTGTTCGGC AACGGCGGCG CCGGTGGTGC CGGTGGGGCT 

GGTGGTGGCG CCGGCGGCGC GGGCGGTAAC GCGGGGTGGT TTGGTCATGG GGGCGCTGGC 

GGCGTGGGTG GTGTANGTGC GGCCGGGGCC AACGGTGCTA CGCCCGGTCA GGATGGGGCG 

GCTGGTGTTG CCGGGTCGGA CRACRCTCGT GCCGCTCGTG CCG 

(2) INFORMATION FOR SEQ ID NO: 302:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 418 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 302:

AATTCGGCAC GANGCGGCAA CGGTGGCAGC GGCGGCACGT CNGTTGCCAC CGGGGGGGCC 

GGGAACGGCG GTGCCGGCGG CGCCGGCGGC GGGGCCGGGC TGATCGGCAA CGGCSGCAAC 

GGCGGCAGTG GCGGAATGGG CGATGCCCCG GGCGGCACCG GCGTCNGCGG CATCRGTGGG 

CTGTTGTTGG GTTTGGACRG CGCCAACGCC CCGGCCAGCA CCAACCCGCT GCACACCGCG 
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CAGCACAGGC GTTGGCCGCA GTCAACGCGC CCATCCAGGC CGTGACCGGG CGCCCC7GAT 300 
CGGCAACGCG CCAACGGCGC CCCGGGCAAC GGGGCCCCCG GCRGGCACGG CGGGTGGTTG 360 
TTCGGCGGCG GAAGGAACGG CGGGTCCGGC GTCANCRGCG GGGCGGGCGG .\AATGCCG 418

(2) INFORMATION FOR SEQ ID NO: 303:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1049 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 303:

AATTCGGCAC GAGGGGCACG ATCGCATACA GCGCTCGCGG C^GACCCGCC CGATACAGCA 60
GCTCGGCACA CGCGAGCGCA CAATACGGCG TCTGGCTGTC CGGCTTGARC ACCACCGCGT 120
TACCGGCCAC CAGCGCGGGC ACCGAGTCCG ACACCGTAAG CGTCATGGGG TAGTTCCACG 180
GCGAGATCAC CCCCACCAC3 CCCTTCGGTT GATAGCACAC CGTGGTCTTG CCTATCCCGG 240
GCAGCAGCGG CTGTGCCTTA CGGGGCTTCA GCAGGTCCAC ACAGACTCGT GCSTTATAAT 300
TNCGCSTTCC GCGATCAGAT CGACAATTTC CTCTTGCGCC GCCCATCGGG CCTTGCCCGC 360
CTCGGCTTGC AGGAAGTCCA TGAAGAACTC GCGGTTCTCG ATNAACAGGT CGCGATAGCG 420
GCSGATGACT GCAGCTCGCT CGATt^ACGGG ACCTTCGCCA GTCGGTCTGC GCCGCGCGAN 480
CTTCCGCGAA TGCCGCTTCG ACTTCCGCGG NCGTGCCAAC GGAATCNTAT CACGGGTTGC 540
CGGTTAAAAC TCCTC^TST NCYGGTCGAA ATTCGGCAAC TTCTTATCCC GGCAGGTRCC 600
AACSANNCAA ACCTCGGCAA GGTTAGGMTT TCCCCCNCTT YCAAAAATNC GGKTTTTGGN 660
CMAATTTCGC CXCNATGKTG MCAAGGMTCT CKAANAAKCS GGGTCYTCTN NTCNGKGGAK 720
CCAAAMGGKT TTGGGGMAGC GKNMNCCAAN CCTWACCCTG KTKAANGGNW TTCCCCCCGG 780
GGGAKKGNGA ATYCYCCSNA NCCCRGGGGG GNMCARATTC TYCCGGMCTC CTCXGGAWTC 840
WGMGSTTTCC CAAAAAACSC CCCAAATTMM TTTTTCCHCN TRTTGANACW CTTTTKARCA 900
MMCSSAARNS ANMCNCTCYC CKCTKTGKTK AAAAAGNAYW CCCCMAAATT TYTAWTTSSC 960
CCSCGCGGGN CCCNCTNTTT TSCNMTWCTM WNYTNCRMCC MMMSNCKSNG KKGGNRCCNN 1020
CRCCSNCCCM AAWYNTKGYN KNTATMAGC 1049

(2) INFORMATION FOR SEQ ID NO: 304:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1036 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(ii) MOLECULE TYPE: Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 304:

AATTCGGCAC GAGGGAATCG AGAATCCCGG AATGGTGAAG CCTCGGTGCC TGCCGTTACG 60
CCAAGAKTCA GGGTGAGCGG CCCCCCGGTG GGAATGCTGA SGCCAACCGG GAAAAGGGTG 120
AGGGCTGGGG TGGAATAACT GAANGTTACT GGGATGGAAA ACCCGGTATT GATATGTATI 180
GGGCCGATCA ANGTTGTGGG .:\ATGGGGGAA GGCTGAGGGC GACCTGTTGG ATTTGGGGAA 240
..GTYRTGGA CRAKACWGGC CAGCCMGCGT GATGGTTTGG TTSAANTTTT GTGCCGSCCA 300
CANGGTGATG GGATTGATT-T TGATGGGGCC SATCGAAATA TTGGGTATGC CNACGCCSAA 360
CGAGATYGCC GGGACGTTCA VQGGCGGGAC AACCMASGGT CCSANGTAAK GGTTTCCTTN 420
ATNTTGATCG GGATTCCGGA ACTMTSTCGA TGSGCTCSAY MTSATSGCCC NACNCCWCCG 480






rrTATTTCMS GCTJAYGGGA ATBAMRGGAA CAAYNTCCCT CCCMGGAAAA ACCAACMSGC 
CCTC3GTNSYC CNCCCRCCNC AKAACCCRTT KCTGTRSTMC CCSMAAATNA CSCCCSCTTS 
NACTCCNCSG AANTWSCCCC CCCSCKNNTr ATSTYCCCGK GTTCCCCCMC CCCTTNAAMC 
TCCCCGGTTA ACCCCCWTNT SNCNCCCCCS YTAAKMNCRG GCTTSTTNCT CCCCCYTRMK 
CNCCCCCTCK SAMCWNCCNC CTCXAACMAC CCCKCYKGSM TNCCCAATNT WCMWCKC-VS 
KTTNTMCTKC CCAAYTNCRC CCNCRCTCCC CCKSTSTCAM WTATAAAACC WCWYA^mJilK 
KCNCWMAWTA MGACWCTCNY NCCCCNCNCK NTTKTAMWCC CKMCCCKCSW TWCYCKCSCC 
mSJS^^^ ^^CCCCKKTY NKWMCCCTTC CCCCCCTCCC MCNMBMICTCT YCSGKTWCWC ..u 

^CTCTTCCN OmrCTCCCC CC.CCCCCCV KKCTCTSKCC 1020 



(2) INFORMATION FOR SEQ ID NO: 305: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1036 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



540 
600 
660 
720 
780 
840 
900 
960 



1036 



(ii) MOLECULE TYPE: Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:305: 

AATTCGGCAC GAGATCATGA ATAGCGGGCT GGTCAGCACC GAAGTGGTCG GCGATC^CGC 
GAGCAAGTCT CGTCTGCTCG CCCAGCAGGA GGTCGGCATC GATGCGGACA CCTGCG^TGT 
CTTGGATGGT GTTCAGTTGC AGGTAAGGCC GACGCCGCAG CTTTGCTAGC AGGGTGTC-T 
GGCTCTTCGC ACGTGAGGTA ACCAATAACT CCGACGCAGA CCAACTCCGG CCCTCGATCC 
GGGTACCAGG CTCCGCCGGA GCCAGCCGTT GTGCCCCCTG GGCCGAAGGT CAGCTGCTGT 
^ TAAGAAACCG CGCCATGCCC GTCGCCAAGT ACGACTGACC GAGCAAACGA 
TCCTTTCCGT GGGGGTAATC GANCCCAGCA ACCGCACGAG CCACCAATCA 
GCCACTGACC GACCAACCGC CTGTGCGAC^ CCCCAGCGGA ATTGGTGGTC 
.xC.o^^GGG CCGCNAACGG AATCANCGSG ACGCGCTCGC CGAASCANCC GCATANC-NT 
GGNNTCTGCG CCCACATTTC GGGSTIMTGC CCCTCNGCAA CSSNAAyiJcC 
.CCAAT.CYG .^CNAAAAAA TTGGYCCATY ARNGTYCTCM CCAAAAACCN ^WTCCCCKTA 
TCCCCCGGGG GGGRCCCCYY NMNAAAAC3G CCCWWAANCC CCSGGGCSCC CGGGTTRWTN 
.CCTGTCG GCCCNCCSGG TTTGGTCMCM GGSCMMTNWN GGGNTGCSCC CCCNC^IAAAA 
^A^^^iS l^''^'^ CCCKYCMAAA ASKTGGGSSC CCCMARCCGG GGKAAKKWWA 
ANTTAANCCN KAAAAAAAWW NCANNMCCCC NGGGNCCTAA GGKYTTAGGG GTTSTTNANG 
A^AAAATMTC CANATMNSSK TTNNAAAAAA ASCCSWAKCC CCCNNNKKNN CCAAWKAARR 



(2) INFORMATION FOR SEQ ID NO: 306:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1060 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 306:
-^TTCGGCAC GAGTCGATTC GATCGAACAC GCCCGCACCT GGCCAGGCCA CATGGGCGCG 



60 
12 0 
130 
240 
300 
360 
420 
480 
540 
500 
560 
720 
780 
340 
900 
960 



rrr^rr^r^^ KKKKiCrNCMS KMNMMTTWGR CCCNCCGCCN NNTWKCCTTN 1020 

i U t-XJ X UCjNG C RN CAGN 



1036
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GCCATGGCCA ACGCCTACTC GGCCAACCCG AATCCATTCG GCGTCTCACC GCAACCCCCG 120 

AAACCGGCGA CCGCGGCATG GATCAACCCG CCCACCCCAG ATCCGAAATA GCGTCCACAT 180 

AATGAGACAC TGGCGCAAAG AGCTTGACAG GCGCCGCACC ACGCAAGCTG TTAGACGTGT 240 

CGGTCTTGCA AGAAGCGGGT TGGCCACCCA AGATCACGCC GCCCAAGGGC ATCGAGTCAA 300 

CGTTGCGGTG OTArCGCGCT AACGTCGGCG CCGCCAAGAA ATGACGGTGC GCATTACCAT 360 

GGCCCTGCTG ATCACCTTTG GCCACCTGCG CACCANAACT ATGANCAGCC TTATGCCGAG 420 

TCTCGTGGAC ATCGGCAGCC GCTTCAAAAA CTCCTTGTCG ACAATSGTAT TGCTCANCCG 480 

CCGAATTCTT NTRCTTGCAA SAACACTNCA TGTTNCSGGT NAACAACCYT GGTTNGAAAA 540 

ACANCCAATA TTGAANTCCC ANTCGGGCAM GAACC2JGTTM CGGAAGKTGK TGGGAACGAA 600 

TGKTGCCCAA AAATCCCGGG NGGTRAAAWW CCCNSNATGG MSAATTTTSC CTOGAACAAM 660 

AAAAGGTCCA AGKYCAAAGG NGCCCCCCCC SGNAAATTGG TGAACSCAKA WYANRTTCCC 720 

WWWTNCAAAT MTTNGGGTCC KUNTCCCCWT AAANGGGSCN CCCCNCCRGG GMGTYTCCCC 780 

NWNMGGGMGN CYYCSCCCCA AAAAAAAMMM MTTTCSGKGG SMGGKKCCCC CCSGGTYWGG 840 

GKKYTTAAAC CCGGKGGGTO CAAAAAANAN ACCCCCCAMS NGGGGGGAAA ATTTGIIAAWT 900 

AAGGKKKTKC SCMACCCCAA AAANMMNNCN AWNCCCGMGK SARGGGGRNY TTMKAGGGMG 960 

YCGGGGGGNA NAAYAAAAGK NGSNGRGAAT NTITOTGK RSSSRNKTTT 1020 
TYNTCCTYCN CCNMGNRWWG SRAMNTGKTS NSSGGGSGGC 



(2) INFORMATION FOR SEQ ID NO: 307:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1040 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



1060 



(ii) MOLECULE TYPE: Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:307: 

.^TTCGGCAC GAGCTTCACC .^GAGCTGA CATGCCGGGT GATGCGACAT CGCATCGAGG 
GCAATACGGG CATGGATGAN CCGAANGGAN TCTGGCGTTC GCTCAACTGG ATTACGGTTC 
CCAAGGTGAA ACGCTTTGCG GCGAAAGATG CGACGCTTAA CTTGCGCITC CACCGTGCAA 

GATGCTGGAA CCGCGCTGAC NGATAANGAA TTCGCTGGTC GCCGGGCACN 240 
CKSTTTTCNC TCCGCSGTTA AATTGCSTGT GCATCATCTG GCAGGCTATG 
RCTGCAGCCC ATCATGGATG TGCGGCTAAC GAANAAGTTA TGACATGGCG 
^G^GAMTC GGGCATSCNC GCGGCAMTTT CGCAACCTGC TGTGTOTGAA GCGTMTCAAC 
CGAATGCGGC GC-ZAAAAGC 5IGGCTTGCGT TGATTMMAAC CNAACCCNTN CNATYcS^G 
CCGNGNMNTG CGTTCTCTCC .^CTCCGKKG SYTGCCNCCG TGAAACCCMA cSccS^C 
GTTGGACTTA MRTNTTCAAA AAMCGGMTNA ACCSGAATNN SAACCTNCCR TCAAANTAMM 
c^^"" TTYGGGNRCC CCCCNGAAYW TTCKNCNGGG GMNNTYCTCN GGTTYNGGCG 
^"^^ CCRTNCYMNN TTTACAMGGC NCMTNMTTGM GGGSCSNNAS GWCCCGGGKK 
TNTTTNCAAW TCNCNSKTTT TTKGGGGGGG GGCYGRTRMC NCGGGCCCCC GGCCCKKMAA 
^^Sc Zl^^'S^'^ KKCCCCCCCM NNATNGGGCG YKCRAAACAA ACCCCAANRA 
^xrM^™ SMACCSGNGN GYNAAAKGGT TSNSCTMANM MKGMANNNCT SGMSCCMNSN 
NCTGMGGGKT TTKGNNGARN AANAMKMGGM RCGGNCGCNN GAAAGGGSMS GSCKSCNNGN 

Ts^Zl ^^11 — W-C 1020 

1040 

(2) INFORMATION FOR SEQ ID NO: 308: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 348 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single



60 
120 
180 



300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
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60 
12 0 
180 
240 
300 
348 



(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 308:

AATTCGGCAC GAGACAANGG CGTGAAATGG GATCCGGCCG AGCTGGGGCC CGTCGTCAGC 
GACCTGTTGG CCAAGTCGCG GCCGCCGGIT CCGGTCTATG GGGCCTAG?? aSSScg 
J^n^^^f CAGGGCGAGA I^CGGCCGT TTTCTCGCCC TGGCTTC^CG SSSS 

tkgggaacgg TCAGGGTTCG caaaccacga tcgggatcgt gcggtcggtc caggact^ 

ANTCCTGATA CTTKGGTACA TCGTGACCAA CTGTGGNCAA TATTCGGCGC SSSS^ 

ngtcgcgtcc cgcgcggtaa ggtccancac TTCCTrrrrc tcgtgccg ^""^"^^ 

(2) INFORMATION FOR SEQ ID NO: 309: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 332 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 309:

AATTCGGCAC GAGAGACCGG GTCGTTGACC AACGGACGCT TGGGCGCGGG CCCCTTGCGT .0 
GGCATCAGCC CTTCTCCTTC riAGCGCCGT AACGGCTGCG TGCCTGtS cS^SgI 
^"^5!^^"^ ATCCAGCGAA CCGCGGATGA TCTTGTAGCG CACAcSSc S^cSS 

'g^SS'^ gS^c'S ^^^^^^^^^ OCTCCTGCAG GTTGTGGCCC tScSS 
.GTACGCCGT GACuTCGAAC TGACTCGTCA CTTCACGCGG GCAACCTTCC GAAGCGCrGA 
GTTCGGCTTC TTCGGAGTGG TGGCTCGTGC CG ^^^^TTCC GAAGCGCCGA 

(2) INFORMATION FOR SEQ ID NO: 310:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 962 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:310: 

^^'gSc SStp^^'^ agacggattc aatgctcccg cgagcacctc gccactgcac 

CCGC^gSS ^rTr^rl ^^^^^^^^ AACGAGCCCT TCCAGACGCT CACCGGCCGC 

ccgctgatcg gcaacggcgc caacgggact cctggaaccg GGGC^GACGC GGGGCCGGCG 

ZTooc^Z gSc"??^"''' '"^^"^^^^ '''''^'^ ZT^c^ 

CgSSS^ Sr^rSnn^ GGATITCrrc GCACCGGSGC ACCGGCGGGG CCGGCGGCGT 

cgcacaacgg caccggcggg gacgcngcgc ccgtngggcg gcttctkgat gggctccggc 

CG™™ c^n^n '''^^^^ ^^^SSI CGcSaS 

CGATCTTCTT CCGCNCCCCG GAAACCGCC3G GGCCGGCCCC ACATTAKACC CGGrGoiarr 

M-?S^'2 ~™ ^^^^KCTANC YYAATCCCCG ANGGKTGAMC CTSATGSNCA 
M...MAG.AA ../TNCCCANT KTTSGRACCW CRCCNGGAAA ASRAWNKNGT KGGCAAACNA 



120 
180 
240 
300 
332 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
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NNTNCYTTKN NATTKGGNNA AAAANCCCTY CCWCSGRACT NCCCCCCNGM GRGMCNNTON 720 

NTTTYGNCNN CCCGGSNAAM RNTTKATTTC NGGGGGNTCN GGGTKMNNNA AACCCCAAAM 780 

MNI2UNKCSCA ANGGGKSNGC MKNNMMNSGT TTTYCKNMRA MRNWTYKNKN NTCNGAIISRN 840 

NAAMCNNSNK NGKKKNNKAA ARIJNTTWKTN KNSCNNNCNN GRRNGVRGGC CKMKGSNMNG 90 0 

MOfflNAWRNG NNGSNCTCKC NNKMNAAAAA AASGGVNCKS NSMKNKKKKG NRGGGGGGGG 960 



GG 

(2) INFORMATION FOR SEQ ID NO: 311:



962 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 323 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 311:

AATTCGGCAC RAGAAGACGC CCGAANGTTT GCGCTXMCTC TACAACTTCA TCAARGCGCA 60 

GGGGGAACGC AACTTCGGCA AGATCTACGT TCGCTTCCCC GAAGCGGTCT CGATGCGCCA 120 

GTACCTCGGC GCACCGCACG GCGAGCTGAC CCAGGATCCG GCCGCGAAAC GGCTTGCGTT 180 

GCAGAAGATG TCGTTCGAGG TGGCCTGGAG GATTTTGCAN GCGACGCCNG TGACCGCGAC 240 

GGGTTTKGTG TCCGCACTGC TGCTCACCAC CCGCGGCACC GCGTTGACCT CGACCAGCTG 300 

CACCACTCGT GCCGCTCGTG CCG 323 

(2) INFORMATION FOR SEQ ID NO: 312:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1034 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 312:

AATTCGCAGT GTGTGTGGCG GCGTCCAGAA GAAGATGATC GCGAACATCG CCAGCGCCGG 60 

CCAGGCTATG GTGCCGGTGA TGGCCGACCA GCCGATCATC ACCGGCATAC AGCCGGCCGC 120
CCCACCCCAC ACCACGTTCT GTGACGTGCG TCGCTTGAGC CAAAGCGTGT AGACRAACAC 180
ATAAAACGCG ACGGTGACCA GGGCCAGCAC CCCCGCCAGC AGGTTCGTGG CGCACCATAG 240
CCAGAAGAAC GAGATCACCG TCNACGTCAC CCGAGTGCCA ACGCGTTTCG GGTCGGCACC 300
GCTTCCCGCG CCAAGGGCCG GCGCGCGGTT CGCTTCATCA CCTTGTCGAT ATCGGCGTCG 360
GCNACCAGTT GAGCGTGTTG GCGCCGGCGG CSGCCATCAT CCCGCCGACN ANCGTGTTGA 420
GCATGANCAG CGGATGAATG GCGCCGCGGC TCGTGCCGCT CGTGCCGAAT TCAACTCCGT 480
CNACAACTTG CGGNCGCACT CGAACCCGGG TGAATGAWTG AATTTAAACC GSTSAACANT 540
AACTACATAA CCCTTGGGGG CTCTTAACCG GTYYTGAANG GGTTTTTTGC TTAAAGGAAG 600
AACYATTTCC GGATANCTGG CSTTNWTARC GAAAAGGCCC CRCCCATNGC CCTCCACAGT 660
TTSCCCCTGA ATGGSAATGG MNCNCCYKNR CNGGGNCTTT AACRCSGGCG GGNTTTTGKT 720
MCCCNNCTKA CNTTMMMTGC ARNNCNGGCC SKCCCTTCCK TNTYCCCTCC NTCCCCCNST 780
TNCNGKTCCC CNNAMNYTNW ACGGGGGGCC YTNGGGKCRM TWTKKTTTGG GCCCCMCCCC 840
MAAANASAAN GGGGKRNGTY CSTTTGGCNC CCCAMAARGG NYCCCCCCAM YTNRRKMCSY 900
CNNTNKGGNN CTGTNCKNCG GAARAMAMCC KCCCCGNSTS STTNGTYWAG GNRWKGNSRG 960
CCSCCCZ:^'£ MNNNAAYAWN WMNATNCNNS STNANMAKKN NNNNNNNSCN WGNGNNTCN 1020
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SCNSNGGKBC CSCC 

(2) INFORMATION FOR SEQ ID NO: 313:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 331 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 313:

AATTCGGCAC GAGCCCACAT CCGGGGCCGC TCGTTGCATG ACTCGTTCGT CATCGTC3AC 
RAGGCACAGT CGCTGGAGCG CAATGTGTTG CTGACCGTGC TGTCCCGGTT GGGGACCGGT 
TCCCGGGTGG TGTTGACCCA CGACATCGCC CAGCGCGACA ACCTGCGGGT CGGCCGCCAC 
GACGGGTCGC CGCGGTGATC GAGAAGCTCA AAGGTCATCC GTTGTTCGCC CACATCACCT 
TGCTGCGCAG TGAGCGCTCG CCGATCGCCG CGCTGGTCAC GAGATGCTCG ANGAGATCAC 
CGGGCCGCGC TGAGTGCGCC TCCCGCGAGC A 

(2) INFORMATION FOR SEQ ID NO: 314:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1026 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 314:

AATTCGGCAC GAGATCGTCA CCCTGGCGAC CAGTGCACCC AGGCCACGCC ACCAGTTACG 60
GCTGATGGGC CAGAAGATGG ACCAGGTGCT GCCCATCCCG CCCACCGCAC TGCAGCTGAG 120
CACCGGGATC GCGGTCCTCA GCTACGGCGA TRAGCTGGTG TTCGGCATCA CCGCTGACTA 180
TGACGCCGCG TCCGAAATGC AGCAGCTGGT CAACGGTATC GAACTGGGTG TGGCGCGTCT 240
GGTGGCGCTC ANCGACAATT CCGTGCTGCT GTTTACAAGG ATCGGCSTAA GCGTTCATCC 300
CGCGCACTCC CCANCGCCGC GCGGCSGGGG CGGCCCTCTG TGCCGACCGC CCGAGCGCGT 360
CACTGACGCC ATCTCCGTCG GCGTTAACCC CGTGAGAAGG TGGGTCGTGC GCAAGTTGGG 420
CCCGGTCACC ATCNATCCGC GCCGCCATGA CGCNGTGCTG TTCCACACCA CNTSNGACNC 480
CCCCCAGGAA CTGGTCCGGC AMTNCAGGAA NTYCGTGTGG GCACCNGCTT CTTCCGKTRT 540
GGCYTAAACT TCCNATSTTN CSGCSGGCCT CTGGCGTTNC GNCCGGGCCG NTCTTNCCAA 600
ATCGGSMMAA ATCCCCANMC AAACCCCCCG GGTCTTGSGG GCSGGGNGGC GGCCNAWNCC 660
AAACCCCCCC NTTAAANTCT TTGKTNCCNN CNCSGGCNCC NCNAANSCAN CCCTTTKGGC 720
NCTTCCCCCC CCCAWTTTAA CCGAKCGSCN AAYCCCAAGY TMMGKCCYCY KNAAAAAAAA 780
AATTTGSCSG CCCCAANTAA ATTCCCNGGC CCYTTGGGGG CGRANCNYNT TTTMCCSNSS 840
TKGNNNAAMC NGGANCCSGG KAAYTMMTKG NAAYCGCCSN AAMBNTTTTC TAANNCCCCN 900
YNCCCSGAAA ATTNNAMAAM CMNNKTGSNG GGGGKTTSNC SGKKGRAGGM AAAAAANRSN 960
SKTTNMCNNN SANMNCNSNN SGGNSNNNNN NNNCNCGYKC CSNAANMCCC CGCGGGGGGG 1020
CCMMCC 1026

(2) INFORMATION FOR SEQ ID NO: 315:

(i) SEQUENCE CHARACTERISTICS:
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201 



(A) LENGTH: 324 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 315:

AATTCGGCAC GAGAAGACGC CCGARNGTST GCGCTGGCTC TACAACTTCA TCAARGCGCA 60
NGGGGAACGC AACTTCGGCA AGATCTACGT TCGCTTCCCC GAAGCGGTCT CGATGCGCCA 120
GTACCTCGGC GCACCGCACG GCGAGCTGAC CCAGGATCCG GCCGCGAAAC GGCTrGCGTT 180
GCAGAAGATG TCGTTCGAGG TGGCCTGGAN GATTTTGCAN GCGACGCCNG TNACCGCGAC 240
GGGTTnCGTG TCCGCACTGC TGCTCACCAC CCGCSGCACC GCGTTGACGC TCGACCAGCT 300
GCACCACTCG TGCCGCTCGT GCCG 324

(2) INFORMATION FOR SEQ ID NO: 316: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1010 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 316:

AATTCGGCAC GANGCGTGCC GCTNAACACC AGCCCGCGGC TGCCAGATAT CCCGGACTCG 60 

CGGTGGCGTC GTTGCTCTCC TGACGGGGCG CGGCGACCAT AAGGTCGCTM 12 0 

ATGC.CAGGx AGCGGCCCAG GTGCATGGAG TCGATGATGA TGCGACTCTC CAGCTCGCCG ^ 80 

ACCGGGAGCT TGGCATCGGG CCTGATCAGC CAGGACGCGT AGGACAAGTC GATCGAATGC 24 0 

--.TAGxGoCCT CCAGAGTGGC CGTGCAMTTC CNGCGTGCTC CACGGCAAAT GCCTTGArrT 3 00 

TANTGTTCCC GCATCGCCTG CGGGATGAAT GGGAACCGCA SGATGGCGAC 360 

GANCTCAGGT TTGCCGCTTT GCGCACAGTG GTCNACANCC GGTACTCGGC 420 

CCCNAAATCG GCGCCGACGG CGCCCACNAT .z^AACGGGC ACNACAATCG 480 

CACCCNAACA ACANCTTGSC ATCGGATTTT GTCCCCANCG CTCAANCCGT 540 

TCNTCCGGCG NACTTTTCTT NNAWTAACTG CCGCTTCCGK CCCTGGNGCA 600 

AACCCriTTCC CCACCTTGAA GGGGTTGTTG NATTTITACT GSTAACCCCG 66 

GANTCGGTCN KCCGGGSTTT YstnTTCCCC ACCTINGNAN GGGCCGGCCA 720 

Cm!!!^^ SYTGAAGGGG GAAACCCAAC TTTNTYTYYN AACCSCMNAA MYMTTTYCSG 780 

MNAASCCNKT CCCCTTTAAC CAMGGSGGTN AACCGKTMNG NGGKTAAAAA GGGSKNNICTG 840. 

GGGGGRAAAA TSTKTCNNCG GGGCCKAAAW ACCMMMMYGN GTGKKKNKSS 900 

NMMRAACTKN GGGGCCSSGA NNTTTNAAAG MSCCCCCSNN GSTGKCCCNN 96 0 

NTTTCCNNAA WMKKGKNWNM SNMNSCSNGG GKYNSGGSNN NNAAGMGGGG 1010 

(2) INFORMATION FOR SEQ ID NO: 317:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1010 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO:317: 

AATTCGGCAC GANGCGTGCC GCTNAACACC AGCCCGCGGC TGCCAGATAT CCCGGACTCG 60 

GTAGTGCCGC CGGTGGCGTC GTTGCTCTCC TGACGGGGCG CGGCGACCAT AAGGTCGCTM 120 

ATGCCCAGGT AGCGGCCCAG GTGCATGGAG TCGATGATGA TGCGACTCTC CAGCTCGCCG 18 0 

ACCGGGAGCT TGGCATCGGG CCTGATCAGC CAGGACGCGT AGGACAAGTC GATCGAATGC 240 

ATAGTGGCCT CCAGAGTGGC CGTGCAMTTC CNGCGTGCTC CACGGCAAAT GCCTTGATTT 300 

CTACTCCGCG TANTGTTCCC GCATCGCCTG CGGGATGAAT GGQAACCGCA SGATGGCGAC 360 

GAACGGGTCT GANCTCAGGT TTGCCGCTTT GCGCACAGTG GTCNACANCC GGTACTCGGC 420 

ATANATCTGG CCCNAAATCG GCGCCGACGG CGCCCACNAT AANAACGGGC ACNACAATCG 4 80 

CCGCCCCGGT CACCCNAACA ACANCTTGSC ATCGGATTTT GTCCCCANCG CTCAANCCGT 540 

CCCGAACGCC TCNTCCGGCG HACTTTTCTT NNAWTAACTG CCGCTTCCGK CCCTGGNGCA 600 

WTAAATGGGA AACCCTTNCC CCACCTTGAA GGGGTTGTTG NATTTTTACT GSTAACCCCG 660 

AA TTNTT CCG GANTCGGTCN KCCGGGSTTT YSTNTTCCCC ACCTTHGNAN GGGCCGGCCA 720 

AGSTTTTCTT SYTGAAGGGG GAAACCCAAC TTTNTYTYYN AACCSCMNAA MYMTTTYCSG 7 80 

MNAASCCNKT CCCCTTTAAC CAMGGSGGTN AACCGKTMNG NGGKTAAAAA GGGSKNNKTG 840

NCCCCY MANG GGGGGRAAAA TSTKTCNNCG GGGCCKAAAW ACCMMMMYGN GTGKKKNKSS 900 

GCSAAATrrr NMMRAACTKN GGGGCCSSGA NNTTTNAAAG MSCCCCCSNN GSTGKCCCNN 960 

NTTTCCNNAA WMKKGKNWNM SNMNSCSNGG GKYNSGGSNN NNAAGMGGGG 1010

(2) INFORMATION FOR SEQ ID NO: 318:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1092 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 318:

NGiJGGGGWNS NTCAYCAYCA YCACSGGGYW CWATTGCGGC CGCAWCTTGT MAASAGATCT 60 

CGAAYTCGGC .^^MGAGGGAMT CKCTMGCNCC GCTGTGCAAN CCAATRAGGC CTRATAATTY 120 

CCACTCCACA AAAAACCGTT GTGTGTAYYT SCCGRAAATR AAGGCGCCGG TNTCAACWYC 180 

GCCGGTKTTY CCRATYCCCG TKTTGTAMCT GCCKGGGTSR AAAYCCCCGG TGTTGGAYCC 240 

CCGGATTGAA ACTGCCGGKT TGAAACTGCC GKTTTSGCSA TCCGGKWATT GAMSTCRCGG 3 00 

ATTAAAAAAC CGGKKTTGGN GCTGSNCGTG CCAAATNCGR AYCCRATAYC CCATGGCCTG 360 

KYCTYCTCCK VCGGTACCCA .WCTGGGTA TCCTATACTG GYCCCTAAAK GCAAWYCKGG 420 

GCTGYCMMTK TTGCKGGSGT CCNAATTTAS CACCASCGGT TCCTTCCATA CCNAAACNCG 480 

CKTGGGCWCC AGMCCGRAAA AAAKAATAAT RAKAAKGGTG CATNYCCAAA ACCNCCGCCN 540 

CCCNANTNCN ATCCGNTNCC MSCNCCCCCA GCGGTNAAGK TKSGGAAYTT CTMMAACCCC 600 

CAAANCCCCA TAACNTNCGR GAASAAACCC CTYCNCGGGG GYCNWNCAAA ACASCNTTAT 660 

TTGCTKSTTT CGGGMWCCGT GCCGCCNAAA YCCCAAASTA CTTTYTGGGT CCNAGAJCAAA 720 

ACCNCGGGCN CCMCCCSNAA NWTATYTCTT KGGCAANCCC CSAAACCTTR TCMNACCNCK 78 0 

ATRMTCCCTT CCC07SCAAT TGGYCGGRAT NCGSNCCYTY TCAAAKKKSC CAKWWNNGNG 840 

GRRNNACCMA ACCCCAAGTY CCMNAAAATN GKCCCCGCTC CNAACACGNK TYYTCCSAAA 900 

ASCCCWCCCC CCCCCCCRAA .\ACCCCCCNA RKANTNCCCA AAAACNYNGK GGCCCCCCCC 960 

CAAACMAAAA AMCCCCCSGM RMACSGGGGN NMCCCCGKKK KKTTTTCTTT TKCCMRSCCC 1020 

AAMGCAMWSY KSKTNMAAAA GGAAGRANCN TYCCSANANM TCCCNYWRSW CCGSWGMGNA 1080 
GAASMCCCCC CS 



1092 



(2) INFORMATION FOR SEQ ID NO: 319:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1251 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 319:



GGGGGGGNNN NATACATCWT CYGTGYACCG GGGMTCTAKT GGCGGGCCGC AATCTNGTCA 60 

ASAGATCTCT NAMTTCGGGC ACAAAAACTW GACAAASYMT CGNGCNMTCC GTCTCCTNKA 120 

TCGCAAAACG NGTRACASAC ASACACRTAT GTGTGCCCAC CASCAAYTCK TTCGGACCTC 180 

GCTRACCGGY TGCCCRNACG CCACGYTGCS CWTCTATCCC RACGCCGGCC ACGGGYGGGG 240 

ATATTCCAGG CACCACGCCC AGTTTGGTGG ACAATGCCCT GGCAKTTTCC TCRAANTTCG 3 00 

TGAAACCGAA TTCNSMTTGA ACCNCCAARG CCCCSNCCNR AACARTTGGG OTCCGCGGTT 3 60 

CTCCCCACCG KTTTCCGGGG GTNTCGGCAN AANCGCACCC WTGGWTTCTM TCNCC3CACC 420 

GGGCGGACAA NTCGGGTTGC AATTTTGCRA AYCGGGGCCG GGATTCCSCA AACGGGTGCC ^8 0 

GAAACTGTTY YCRAAMACCG GGAKCCGCAA TTTCCGGGCR ANAAATTTCN YCNCACCACT S40 

GCTTRTACTT CCCCGACCGT AACMANTTTC ATCGTCNTNN CCTCTGCCCT TGGGGCAGGG 600 

CKAAAYACCG CMTTKGGTTT CGCAACCTGC GGCCCAANTC CCNAMCCRCA CTTTCNATTT 660 

GGNTCGAATT SCCCCCCGGT RANAACCSCC NTGGCCNNYT CGGASSAAAA NGGGCCCTNT 72 0 

KGGCNSCCCC AGTAANACCC TACCNNAYTS CAWTCTTTGC CAAASTTKGG ACGAANSKTG 780 

GGNTTCCGGK ATTTYYTTGS GGNCNCCCTN TATNGGSNTN GGGCCKCYNC NCSTKTGKCA 840 

NASSKAYCCS NGNKGGGGGT ACCCCCCTMG GGGGGTTTTT NSSGCCCCCC AWAYGNKSTG 900 

GCCCCCNNGG GGAAKAATWT MWWTMCNSGG GGGAAWTTTT NTSTGGAMCS SGGACYCCCR 960 

GGGGGKTT^. TCCCCCNCSA NNAWANGGGG GGGGGANAYT NTGNSGNGGG KWNTTTATTT 102 0 

YTYYCYCCTM TKACMSGGGG GTTTKKAKNG GGGGGAGAAA ANAAAAAAAA RAKGGYKNTT 108 0 

TSKNCACNCT GKWNWNWANR NAGAGKTCCT CKCKCCNCSG SNTTTCTTTT MGNSGSYGGG 114 0 

GNNGNNNAAA ACJJKSRMMAC KCSYTYCCCG CGYCTCCTCC NCNGGGGYGS NGSCGNSTYN 1200 

GNNKGRKWTA TNTMGNCGTN 3CCTCCJJCCC GCKNKNTGTC TMTCNMYGSG C 1251 

(2) INFORMATION FOR SEQ ID NO: 320:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1099 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: Genomic DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 320:

AAYTCGGCAC MGAGTATCAC CAAKCTGYGT GGCCCAGCAA AGTGGAGCTA TTACTACCTG 60 

TATGTGATCC TCRACATCTY CTCCCGCTAC KTGGTCGGGT GGATGGTGGC CTCGCKTGAK 12 0 

TCRAAGGTCT TGGCCRAAC3 GCTGATCGCG CAAACCCTTG CGCCCAGCAC ATCAKCGCCG 18 0 

AACAGCTGAC CTGCMCGCCG ACCGGGGGYC GNCAATAACT CCAAACCGGT GGCMCTGCTG 24 0 

CTGGCCNACY CCGTGTCCCA ANTCGAACTC ASCCSGCNMA CCAKMAACKA NAACCGTTGT 300 

CTGAAGCCCA GTTCAAAAAC CTCAAGTWCC GGCCCRACTT CCCGAAACGG TNCGAGTCKA 3 60 

TCRSAGGSGG CCGGGTGCMC TGCAACCGGT TCTTCGGNTG GTRCAMCCCN AAAMCAAGCA 42 0 

TTCCGGGMTC CGMMTGCCCA CGCCGCCAAS TTTMCTACGG GCSGSCCNAT CAAATTCGCC 48 0 

GGGAACSGSN CCMCCKTCNK GGAMACGCCC TWCCAAAACC CYCGAACGGK ATCCTTCKGY 54 0 

NAACNCCCGA RCNCCCKSKT TCCGGGCTTC NMSGCGAATA CCCKNSCMNT CCGAATCCAA 60 0 

..CCCMKYGG CTTTTYYYCC CCCCGGCCCC AAAYNGGGYC CCTASSNMKC KNCCAMNANT 660 
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720 
780 
840 
900 
960 
1020 
1080 
1099 



60 
120 
180 
240 
296 



CCNWATCTGG NGGTCCCNAU lOTGGCGrrC NMAATSAMNA NMNRGGGTYT TSCYACCMMN 
AACCGKNNKG KCCCCMKCTK MANAAAKATT RATCAMKWNG GGNKCKCNCN NAAMACCSCN 
CNCYNCWYTC TMYCSSKWGC GCSMYNANCA SNGGGGAGGW GGSGRMKMCT CTMTCTCNCT 
MGCGCCKNTN TYCKSGAKAT ACASMNKTCC GCGCNGCGCN MAAMANRAKA CTAKCCGYGN 
CCSNSTMTYN CTSNNMKMKN TCCWMWNATC NTYYGKKCNN KCTOKATNWC CSCTSKCTCK 
MRAMTCKTYG SNMTCCTCCA TCNCTCKKSC SNMSKNTCKC KSCNCCNCWN CNKCNMKCWN 
GGNSTCRCCY TCTMNNNTCS AGCKCGSKNC WACNCACACK NGWCTYTTCC WKNNMKCNKM 
TCKCKCACRG MTMTCWCCS 

(2) INFORMATION FOR SEQ ID NO: 321:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 296 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 321: 

GNGNTATACA TCWCTGTGYA CCSAGGATCW ANTGCGGCCG MAAKCTWSTM CASAGATCTC 
AAAYTCTGCA MGAGCGGCAC AKAKYSTCGT CCMRACCCGG CAYACWCCWG CNCGCCCCWT 
CTTRGACCGG GGCKATASMC ACCGTTGGCC CCGGCNCGCA CCTACACCAC CCACGCCGC- 
AGCGCCCCCW TRAMCAAACC ACCCCGCICrr TACCGCCCGC GCCGCCGGGG CCACCACCAG 
CCCCACCGGC ACCACCGGCG CCGCCGTTGC CAAAACAGGC CCGCKTTTGC CACCRA 

(2) INFORMATION FOR SEQ ID NO: 322:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1073 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 322: 

^JGNGSGNKMY ATCATCWTTC TGCACCSNGG MTCWATTGCG GCCGCAATCT TSTMNASAGA 60

TCTCGAAYTC GGCAMGARCA TCTGCGCGGN GAATGTCCAA AWGTCWKTAA CGGCMATCGG 120

TTTGCCGYCA ACCACKCTRT SCAKATGCGG GCCAMWTYCA AACCRATTAT TTGGGYCGAG 180

-^AAATTTMCG CKTGTRASCA ACCTGCAGCG GGTCAASCAA CAGCCTCTRA ACCGTAAATY 240

CICTAGGTNKT YCCGGCAACA ASCYCRATAA TSCGGCCCGC AMCCACAAAA CCTGANTOGT 300

TNTTCNCRAA NCCGGTYCCC GRAGGGGTSA ACTGCSGTAR GCTTNTCWYC NCCTTRACAT 360

TAAACCCCCC CGGNTCWTCG CCGCGCCCAA ATYCYTGCCC WTKGCNACCA YCCCANCCTG 420

CSGTATGGTS RAANCASTSG GCRAACGGTM MCCSTACCKC TGGCTGATYC KTCGGNTCCS 480

3NAATTCGGG GATTTACGGS CAMGGTTAAY CCAGGYCCCC TNTGCYTCKY CNACAACCSG 540

ATCMWCNCCG TACCTJCTTAA AATTCTTTGT GGTGGAACCC AWYCKAAAAA NMTNTYCCCN 600

TCCAMMGGGG CYCGGAAKKT CNACNTGGICT NACCCCTNCC YTTGAASTTT TCYTGNCCCC 660

GGCCCKAAAS ANACCSGAKC CCCGGAAYCS WTAGGCYTCN TGCCCCSTTA AAriKGNCYC 720

.^TCCKCCAA CGCTCCCCGG GGTCSSCCMT TAAAMTTCCC CCCKSCASNG GAATYCYKSG 780

GCWGTMATTW CCNCCCNTTT CYYGKNAAAC SCCCCCWKGN GSCTYCCCCN SNTTSSGCCS 840

GGTTSGAMYC AAAAWTNGGG MMCNRAGNC3 SGNAMCCSCN GKKGGGSATW TKAAYYCYGG 900

3GGGGTCNYC CCCCRCSNAA AAGYGTKGGC KCCSSSCCYC CCMARTTrYT CNGGMRCMAM 960

ACCANGGGNG CTCCCGTNCW WGGCTCCCSN SNSMAMAAAN NKCKCCKGGS CKGARRNMNA 1020

MCTCSNGNGG WTCCCKNKTC NSCNSGNCGS YGGNSASWCC YNYCNCCACA ANC 1073

(2) INFORMATION FOR SEQ ID NO: 323:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1166 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:323: 



CGCCCCGTTC TTMMMTTCAY TCATTCACCG GGMTCTAGTG CGGCCGCAAK CTTGTC:<ACA 60

GATCTCGAAY TCGGCAMGAS ACAATSTCGG GTKGGGCAAT GTCNGGTGGG GCAACTTTGG 120

3CTC3GRAAT YCGGGGTTAA CGCCGGGTCT RATGGGTSTG GGTAATATCG GGTTTGGTAA 180

TGCCGGCAGC TACAATTTCG GTTTGGCAAA ATATGGGTGT GGGCAATATN GGGTYCGCTA 240

ACACCGSCAS TGGRAATTYC GGTATTSGGT NACCGGTRAY AAYCTGACCG GGTNCGGTGG 300

TTYCAATACC GGTAACGGGA ATGTSGGTTS YYYACYCCGS GSAACGGNWW YTTNGKTCCT 360

TMMCNCTSSM CCKSAAMTSM KMGGTSTYCT MTYCNNGGAS TAMTYNMCCC CCGWAYCKSC 420

WAYCCCTCGT CATYCCMCMC SGSGYCCTCA MNCCACCYTG NGYYCCCTCC MKMTCYCAYT 480

CMNTCCGGTW CCTNTMMNCC CSCNCRYCTC AMCNCTKSGK CACCNATMYC CSACKCHTCT 540

MCYMCSCAKN MTTCCCCTCN CCTYTNNCCA MCMCSCTCTM TCMAACTCKC CCGGYCKCNC 600

MYCTCTCXCC AYNMAACCKK TYCYWCNWYC YMYCKCKCAG WYKNMCTCCW ACTCTMYNTT 660

TCTCTCNKCC CMKACCKNTT CTCWCSCCCC CCACAKAYMC YAWCMTMTCC MCTCKACSCC 720

CYYCNNYCCM NMCWCMTCWC TWNAKCANCN TTCTTCTCTC iMMYMTMACKC WCNNTCNCCK 780

SGACCYTCTC ACTKMKCCKM TCTCCTTMCK CCYMWCNTCC MKYNCCCTCC NMTCMTCKYT 840

CCTCNCNMRY CYYTAKCAKC NMCTCCCCAN KMCAKCTKCT CCCCCAKMKS ACNCKCCCWC 900

^ ^ rp J^rp ^ WCTCTCWCTY ATCTCKCTCW CNYCMYMKMC ACNCKCYAYT CNACTMNMWN 960

CCANCNCrCT ct:tyctcv/c:< ACGT^/CKCCK CTMCKCNYMC NRWCTYRCCT CKKCCNCCRN 1020

CKNMCMKCTM            TCCCWCCCAT CTMMKSTCTC WOJCMTCCCT CNKCCYNYNT 1080

XCYT Y C CMYG CTTCXNTCMT MCCWCCYATC TCTMKCCTCT CWCACYMCAC WMTTACWNCC 1140

ACTCTCTRCW CKCCKCMCCR OTCTC3 1166



(2) INFORMATION FOR SEQ ID NO: 324: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1230 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 324:

NGNGGNNNNT CWTACATCWN TCTNCACCSG NGMTCWATTG CGCGCCGCAW NCTTGTMNAS 6 0 

AGAATCTCNN AAYTCGGCAC ANATGTCTTT TSTMTAKTGT GGCGGGGNGC CACGCCKTAT 12 0 

GTGYGCCTGG GYTRACCCAA CCCCGCGGCS CGGGCCRACC AGGCGGGGRA TSCAGGCCGC 180 

GGCGGCCGCG GCGGYTATAT RAAGCGCCGY TTTTKTRATA ACGGTSCCGC CGCCGGGTRA 24 0 

TTACGGGCAA AAYCGGKKTT TTGGGTRTAT AACGCTAATT GCAACCAWTT TTTYCGGGTC 3 00 

AAAAACYCGG CGWGCANATC NCGGGYCNCT RAGGCGCATT YMC3CCAAAA WTNTGGGCGC 3 60 

AAAACCCCKT TSYTATTTTN TGGGCTATSC GGYTGCTTCG GCAAACGCTY CCCGGGTTAA 420

TCCCKTCCGC GGCGCCGCCN AAAAACCACC AATYCCGYTG GGGGTGKYCC CMCAGGCSGT 480

TGCTYCGNGY CACCTGGCCA AAYYCCCAWT AKATTGGGTG SCYCKTSCGG TTSYTGGGCY 54 0 

CAATTACCCC CNCGGGNAAA GRRAAAANAA ATCNTCCNTT TGCTCGGYCA YCTTTMTTGG 600 

SAAAAGGGGC ATGGCSCGGT TYYTTTACCT CAAYCCCCNA NCANTWACCT YTCCSCCCGG 660 

GGGGNCANAA CGSTTNGCTC CGSGGNAKCC TKGTMCCCGN ATCNAAAGGC CNGAATTTGG 72 0 

TYYSSTYCNA ATTV/TWIOCKY CCCCWCNTTG YAAAAAKCCA AAASAKCCCK YCNCAMMYKT 78 0 

NGGGGTYSSG GCCKNYCTTK SNMTTAAACC CYCCCCAAAA YYNSGGGKKT TCCGCYNSAT 84 0 

KCCACCNCCK GNGGGGGGNA SAAAAAAAAY TTTYCCSAAA ATCCCACCYY TCYKTKSTRY 900 

AMACCCCCTT TYYMKKAYTC CKYSCNATTC SGMTTCWAAA TYCCGYGGCT TNTTCCCCCK 960 

CSGGNGCCCC AAWTTTGKTT YNCNANTTYC CCCNAAMNCM AWTMGGGGKS KCCATTCTGG 1020 

SCYTMAANTA AAANAANGGG NKTTTYYCTY MANAAACACN GTGKCNCNCN CNAAMAAASN 1080 

AKMAAAKAGN KKKMTKNNSA AANCCNCCCC CTSTYTNYTT WKTNMNCKCC CYGGKKNKGM 1140 

SWSVmJTTCT NCCCRCCCCC YNYNKTGANA AAMMNCYCCS GGSTMCRNAN ASNMNTTTCK 1200 

STSTNGMGCC KMBASNANAN MCAMWKWYCC 123 0 

(2) INFORMATION FOR SEQ ID NO: 325:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1022 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:325: 

NGNGGGKNNA TMAYCWTCTC ACSSGGTCTA TGCGGCGCAW CTMGTMAASA GATCTCNAAY 60 

TCGGCAMNAN GCATMTCMMC CATATATAAC CATTGCGTCS GYWTGCAWCT CRAAWCTGTC 120 

CTTCSKGCCG TTKTACRAAG GTGGMWTGYT CWTYCCTRAA SCCCTCRATC TCKTKTATYC 18 0 

CTKGGGCTYC ACTTTAACSG RATKSCTGCC TTKTAYCATT RATGCAAWTA WTGGYCRAWT 24 0 

KTTGCAGGCC RACGGCWYCT TTTYCCGCRA GRACAATNGA TTGGAWYCGC TYCGCRAGGC 3 00 

CCGGCACCAR ACCGGGCNCC AAAGGYCCGC GCAAWTSCCT GGKTCAAAAA TGGTGCAAAC 3 60 

.\AAMCNATCC CCGGYTTRAC CGCAGYTAMC ACAAKAAAAT TCCCWTGGCC GCACCAWNNT 42 0 

TTYCRATCT^Y CWYCCCCACC TTRAACTTGK YTGCSGTATT GCCTKCCTGC CTCRACAGCM 48 0 

YCNCCCKTCA AACCTGCGGT GACTCCAACT GGTCTGGYCG AASGGGGGYT CAMCGGACAA 54 0 

-\ACCCCRANN TCGCCAAATT TTCNCCCCCC CYCGGGAAAN GKTGATMTTC TCSNAACCSA 600

CiMGGGNNYTW NAACCCTGAA CSSSGSNKGA MYNSCCSGGA ANTTTTCCCT TYNGGGCGRN 66 0 

AAANCCTTTT AAGGTACCCC KGGNGGGGKG CCCYYTTGGG AAAACAACCC CXATTGGKTT 720 

TGGAAATNTT TKCNCCCCCA TTCNSGGGGG GGGCCCCAMC CCMMCTTTTN TCMSCNMTYY 780 

iCYYGGGAAT TNYTCGCCSG GAAYYCGGSM CCKGYCCTAA NCCCCMNWGG GKYSTGSNAR 840 

GGRATMAWWT TYSTTTYYMC CCGGCNNCCC CCCKAKMCNT KGNTGAACMA .^U^AKCSGGGG 900 

GSCNMYMWYY YCNNNGNRTT TNRGGSSNMT TYMAAAMMAN GGGGKYWTYY CKCCNGSCNN 960 

GKTYSGGGST TTTCCNTTTS GGGSSATYKG MACCCCKTMT AYCCGGGGGT NTKTKYCCCC 1020 

SC 1022 

(2) INFORMATION FOR SEQ ID NO: 326:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1083 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 326:

NNCGNKGKNTA TAMAYCVTYCT NCACCSGGGA TCWATTGCGG CCGCAATCTT STMAASAGAT 6 0 

CTCKAAYTCG GCAMGANCCG CAWCTATTTG KGTGRASCGC ACCAGCGRGA CCTCGCSGKT 120 

CKTTYCTTGC AGRGAGGCCK TGGGTGGCRC CSGTGGCAAT GCCAACCGCC CCCCAAAACN 180 

CCGCAAATMY CRAAAAACAA CCCSGGGGTA GKTCCSGGCC GCCAAATMAA TAACCGTKTT 240 

AACKCAGGCN ACGGCCAACC GGYCCCGCCC AACCAAGCNA CCTCCCCSCC NATAGGYCCG 3 00 

GTGGGGGCTG CCKTATYKCC AASTCGTCAY CTCNACGGGM CGGYCCMCWT TCCGCCTCAT 360 

CCGTCTCTCC TTMMATTTTC CRTCCACYKG GCGGGGAACY TTTTTNYCNC CCTTGSCMAN 420 

CACCNAAGGY CNAAAATTNC CCMTGCCKYG SNNCAAAYGR GATTGGGGTY CGKKTTTTNT 480

TCNMCCMAAC CCCCNTTTNA CGCCCCMATC CCYTWATACC CCCWWMCMNS ANGKTTGNSA 54 0 

AAKTNNCCCC AAATRCCAAA MTTCTTCGCC NTTTMTWMCY YYCCTTTCCC CMCCCWNAAA 600 

GGSCCRCCYY TCGGGAANTY TCCCCNCAAA AWTCAMWCCM TTTCCCNCCA AGAAWTTCSG 66 0 

SACTCCTTTN TTCNGGGNAM ATANATYYTT YCKTNGGGSK TTCCGMTCNC AMMAATNTCC 720 

RGGGKAAMCC AGKNTNOTCC YYYYCCCCAA NNTYCCYKGG RMCYNNYYCY TTAAANRASR 780

SAACCCICSGG GKCYNCNCSS TARCCCCCAK KAAAATTTCC CCCSSKTTTC TYY^JNXKMRW 840

GCCCCCSAAM ACTMTWAYTT TCCCKCGNNN TTTSYCCKCS KCAMWMWMTG KKNCTTTTTT 900 

YCSCMATAMA CTTNGGKCCT NTCNYGSGCG CMAAANAAGG CGCGSTTCTN TTCWMAMACA 960 

YNTSGNMMMA SAAKAKWATA AWNNTRJOCYK TKNNCCOJCC CKCKCTTSNN TNKCCMCSKS 1020 

GGGKNWNKKR GWCTCCWCNC CKCCCNCKNK CCKWATMCCC CCCCSKCCGM NCMKt3TTTKT 1080

CCC 1083

(2) INFORMATION FOR SEQ ID NO: 327:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1069 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 327:

GGGGNNKYAT MCAYCWTCTS YACSGGGHNC TATTGCGGCC GCAWYTNGTM GASAGATCTC 60

GAAYTCGGCA MGAAAAAAGW GATGTGCTGG ACCTTMCCGC GCGGGACGCR ACCRACAAAG 12 0 

RAASC6CGCC ANAATATTGG CCACAKTTGG TCACATATTT ACCCAATTKT AYCAGGGAYT 18 0 

MCCATTCCKG GGACCRACCG CACAATCCCR ATSKTGGTTT GCRAACCCTR ACCGTCCCCA 24 0 

KYTYCGCCRA STTGAACCAG GGCRAAAAAA CGGCCRAAWY CTCGCCCTGA NTCCCGCTCS 3 00 

GCGCNAATAA CTAGGCCCAT TKAACGGAAC CGGNGGCCSC NANTTGGCCA ACAGGTCCTR 360 

ACAAAGGGGC CCCASYYCGG CCGGWTCCCW TTYCACNCCC TNKTCTCKTG CCGAATYCGG 42 0 

WTCCRATNYC CCWTGGGCCT TKTCKYCKYC KYCGGTNCCA AWTCmGGTA TNCTATRGKG 480

TCCCCTAAAT SCANATCTGG GCKYCCATTT NCTGGSNTTC NATTTAMMAN SRRCGGTTCT 54 0 

TTCWTTCCRA AACCGSNTGG GCCCNNMCCA AAAAATGATN ATAATAATGK YGSCTTTCAA 6 00 

ACGCCGCCCC CCCATTCRWT CSGTTCCANC CCCCNGNGGT TAAGKTGGGA ATTTYTNAMC 66 0 

YCNARGCCCT KATTTSGGNA AAAACCYCYC GGGYCTCAAA CMNrrTTTTT GSKSSNTCGG 72 0 

GCTCRTTCSC CAAAACCCAA ATTNTYNYGG GGYCCKTNAA ACMCGGYCRC RCCGGAAATT 780

TTTYTGGTTC AACCCCAACC TTTTCAASCC NTTTTYTYYT TRCCSSCSMN TNGSSGGGNT 840 

KSSCCNTTCY RARKKCCNMN GGGGGWYCYN CCCCRMNTTT CTTTTTTTTT CCGTNNMAAM 900 

NGKTTCTTCA AASMCCCCCC SCCCCCNSAA ACCCCCTNAR GTTTTYCMMA AANNWYNNGN 960 

KNCCCCCCCC MKNAAAAAAY YCSCCCGNRN ACSMSNGGGA MCCCCCGGSN NTTRKTTTTT 1020 
TNCMSGYCCC CSRMASYYTT TXAMAMANRR GAMNSMTTTY TNNRGNWNK 1069 

(2) INFORMATION FOR SEQ ID NO: 328:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1210 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 328: 

NGNGGGGKWK MATACATCWT TCTTCACGSG GGATCWATTG CGGGCCGCAW TCTNGTMCAA 60 

SAGATCTCGA TYTCGGGCAM NACCCACCWC TCCRAAAAAA ACCCRAAWCT CGGGSKCTYC 12 0 

GARAAGTGTT GCCCGCKTTR AATTTAACAA ATTCAGTGTC ANAGTGTCAC GGCKTTACWT 180 

YCCCGGCAAA GGGGCCACAA CCTGCAGRGA SCACYCRATG GKTGYTGKTS CNCGGGCGGG 24 0 

CCGGKTNAAG GGACCTGCCT GGGTKTGCSC TMCAAANATC WYCCGCGGGT YCGCTGGRAT 3 00 

MCNCAGGGGT GTCAAAAAAC CGCAAACAGG CACSCCANCC NTTTACGGGS CTTAAAANGA 360 

AAAAGGGCTG ATGCCCCCAA GGGGGCCCGC NCCCAACCTT CCGTTGGTCA ACAACCCGGT 420 

CTCTCKTGCC RAATCCGRWT CCRATNYCNC CWTGGCCTTK TCKYCTYCTY CGGTACCCAA 480 

ATCTGGGTAT CCTATASTGT CCCCTAAWTT CCAAATCTGG GCTGTCCATT TSCTTGGCNT 54 0 

TCCAAATTTA CCANCAACGG TTTCTTNCAT NCCAAAAACC GNTKGGCKCC NRACCCRAAA 60 0 

AAATGAATAA TAATAANNGG KCNNTTYCNA ACCNCCCCCC CCCNATTCCA TYSNGTTCCA 66 0 

NMNCCCCCAG NGGKTAGGTK GGGAAANYYC TCMACCYYCA ANCCCTWARS TTTTNGRAAT 72 0 

KAAACCCTYC YCNGGGTCWW TYMAAAAAMA NTTATTTGGN NGNTTTCGGG MWNCKRKNST 78 0 

SCCAAAATCC MAAATANTTT YYTGGTYCNA TWAAAAAMCG YGNCCMNCCC GGAAAAWTTT 84 0 

TTNTGKTTSA ACCCCAAAAC YTTTTCMNAA NCSSKTTTTY CYTTCCCCCC AMNWTGGGYS 90 0 

GGGNATKGYG SCYTNTCTTA TKTKYTYMTW CMGGGGGGNN MKMTCMMCCC CCMTTTYYCY 960 

NYWRTTTTTN KCCCCKTNMR NNRAANNGGN YTCSYNANAA AAGCNCCCCC SCCKNCCCNA 102 0 

AAAAWCCCCN NNNARAKTNT TTMKANNRMN SCKCNKNGKY YCCCCCCCWC YNMNNAAAAA 108 0 

AATMYCCNCC RASANMCASM NMGGRGNRSC CCCCCCCSTT NNNNTMTTNT TTTTTTCSRA 114 0 

GAGCKCCSCG MNNANMKNCK CTTTTTKCNC NNGNNGNGNN GGNGMNCKCC CCNAGAAMWK 1200 

CTKSTCCCKS 1210 

(2) INFORMATION FOR SEQ ID NO: 329:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1105 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 329: 

^GSSSNGNNA TMCATCWYCT GYACSGGGMT CWATTGCGGC CGCAACTNGT MAASAGATCT 60 

CGAAYTCGGC AAKANACACC ACCGCCGTGT MTATACACCG CAAATGTTCT GTKTGCCAAA 12 0 

ACCGAGACGC GCCGGCCGCG GGGYTCCAAC GCKTTACYTR ACCCGCCAGY TCAGTGTTRA 180 

AACCGGTGYT RAGGGCCGCA CCCAACWTAA ACGCTTTAKC CAAGRAWYTG GKTGGCCCGC 240 

AGCCACCTGY TGTGGYTGCC CTCWYCGGTG GTAGCGCCGG TTANCGCCGG TTGCGCGYTC 3 00 

AMCASCSCGC CGGTRATCCC AJCCNWTCCCC CGGCCMRACC CACCGGGCAC TTTGRACGGT 3 60 

GCCGCCAATT CAAAYCKYCT GRWTCCTTCM AAACACCACR AAGGCCACCM CCMSCACCNA 420 

ATMGGGRACT TTAAGGCCCA GGCAAAACCT NTRAKCNCCT CCCGGGCRAA GGTCCSGCAA 4 80 

3CRATCCMAA AAAAKCKNAT TTCCCCCAGC AKCAACCCAA MMCGSTTTGC TGCTTCCGGA 54 0 

TTCGAAMCCA ATTMCWGGKT NCNWGGGAAA AACASCNNCC NWTAKCCMGG CCCMCGGGCA 600

ATTTCSGRAA SAACCCCTNY CCCGGGTTTT YCCTGCTCMG GCCCAANACC CCCGGGAATC 660

AAAAASGGTC GGNCAAANGG GCMAAACCCS SACCCMACTT WTTCCRCTTN GGGGGGSCWN 720 

CCKNGTTTAA AWKSCCTCYY CTSCCCAAAY TCGGKCMAAA NNGRKTTGGK TTNGGCNACC 780 

NTTTCCGGKC CCGGGKGKGK WCKYCTMNMA CSTTTNTTTT SCCCCYKAAA MYSCCCCCCC 840 

CGGSSCCCCG CCCGGGGGGA NNTTTTTAMA GKKTYCCCCT CCCCAMAAAA ANACCCCNYC 900 

CCSGGSCCCT TTKRWAAAMN KCTSCCCCNG GNNGGGGKCM GGKTTATTMT NNNCCSCCCC 960 

TCCGCGSAAA AAATAKMTTT SYCCCCCCNC CTCCKNCKNR GKAMSMSCGC TCCCYCTCNC 1020 

GCNKNTWAAN ARSNCCKKNN CCNCYKCCGS NSNGKCNWCD NCCSTSSNCT NKGCNCKNCN 1080 

KAAANAAYNC NGSMSTSSMN CNKCC 1105 

(2) INFORMATION FOR SEQ ID NO: 330:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 936 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 330:



NGSNSNKNNN TAMAYCWYYC TSCACSNGGA ACWANTGCGG CCRMAWCTNS TMKASAGATC 60

TMGAAYTCGG CAAGAGCGGC AAGAGTGTGT GCATCTGGTC ANAGTSTMMA CRCGGTGCCG 120

CSGGTGKGTR GASCACMCAT NTGCGRACAC CAAACCCKTC GCGGGYCACC GGCKTCGCCT 180

GCAAAWYCCT CCAGGCCACC TCRAACAAYW YCTYCTGCAA CGCARGCCGT TYCGCGGCCG 240

RATCCTGGKT CASYYCGCCK TGCGGTGCCC AAGKTACTGG CSCAYCAAAA CCGCTCCGGG 300

RAACRAACKT AAWTYTGCCG AATTTCNTTC CCCTGCGCCT TGATAAATTT NTNAAGCCAC 360

CGCAAMCCTY CGGGCKTCTC CTCKTGCCRA ATYCGRWTCC RATAYCGCCA TGGCCTNKTC 420

KYCTYCKYCS GTACCCAAAT CTTGGGTATC CTATANTKYC CCWAAANRCA AWTCTGGGCX 480

KTCCATKTSC TGGSKTCCRA ATTTAMMACA NCGGTTTCTT TCWTACCAAA AACCSNTGGG 540

CCCCRACCRA AAAAKGATAA TAATAAKGTG CWWWCAAAAC CCZGCCZCCC RRTTCAAYCG 600

GTCCARCACC CCANGNGGTN AGGTNGGAAT TYTMAACCCC CAGCCCATAA SNTTNSGNAA 660

AAACCCCCCN GGGYMYCAAA            GGGMTTCSGS CCATKGYKCC AAAACCAAAA 720

TMTTTCYGGT CRWAAAAACC            NAAATTTTTT GKCAACCCCA AACCTTTMAM 780

CCNNNTTCYY YCCCNSACAA TNGGSGGNKN NGSSCNTTYT TWTTTYYNNA GGGGGGRRWC 840

3NCCCCNAAN 'jTYCCNAANKG NKCCC3SNMA AAAGAGANTT YCMKAAAAAC CCCCNCNCCC 900

NAAAYACCCC MAAAKWTTCM AAASMSCNNG YCCCCC 936



(2) INFORMATION FOR SEQ ID NO: 331: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1042 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 331: 

NNNGNKNNNY ATMMAYTCWY YCTSCACCSG GGNNWCWATT GCGGCCRMAW KCTTGTMAAS 60 

AGATCTMNAA YTCGGCACAG ASSSGCACAG ASCCGCGGCG CTATYCMYCC GYTGCTCATG 12 0 

CTCAACACGC TCKTCGGCGW GRATAATGGC NC3CCGCCGG CGCCAACACG YTCAAYTGCT 180

TCGCCAACGC CATATNTCAA CAAGGTRATA AAASCAAAAC CGCSCGCCGY GCCCTTGGGC 240

GGGCACYCGG


KTSRACTTTA 


AASGGTAATC 




CCTSYTGGCG 


WGGGTCTGGC 


CCTGGGYCAC 


J 6 0 


TKGGTYCAAC 


CAACCCACTT 


CACMAAATTG 


420 


TAATAATCSG 


NTGKTCSGCC 


MYCACCGGWA 


d. B ft 


SAAATCATYT 


CCTTCTGRAC 


CCCCACAMRC 


540 


NTCYCTCTCN 


GTRCCCAATY 


TGGTTTCTAT 


600 


YGSTTCCAAN 


TTNACAAMAS 


GGTTTYTCMT 


660 


RAAAANAKGG 


KCTTTYAAAC 


CCCCCCCTAT 


720 


GAAAYTTHRA 


CCCAANCCMT 


ARSTTSGNAK 


780 


CTTCGGMCTT 


YCCAAATMSA AAATYYTCKK 


840 


NAAMCCCKMA 


YYTRTTWMCC 


WTTTTCCYCC 


900 


MCRNNSGACN 


CCCCMNTYTT 


TWTTCKCWCN 


960 


MTCCNCAAAK 


NTTTNAACNN 


NNKYCKCCCC 


1020 
1042 



CCATANCCTG GCCGGCSCTG GCAAATTTCC 
CTNSAAATCC GRATCAATNC CCCNKGGCTT 
RKTNCCCYAA TSCAATTGGS TTYCCRTTSC 
ACCAAAACCC NTGGSCCNNA CMNAAAAKNA 
TCAWYCGGTN CMRNWCCCCG NGKAAGGKGN 
AAACCCYYCG GGGTSMCAAA MKNTWTTSSC 
KRMNAAAAMC YGNCCCCSAA ANATTTTTGT 
CCMCNNSNSG GNTNCCCTTY TYATTTCYMM 
MMARGSNNYT RGRMMNMNCC CCNCCCCNAK 
CCCMWMNKNC CCCCMNCMTT TM 

(2) INFORMATION FOR SEQ ID NO: 332: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1073 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 332:

ATAMATCWCT CTSYACCSNG GMTCWATTGC GGCCGMAWTC TOGTMAASAG 60

ATCTC.AAYT CGGCAAANAK ACGCMAYGTC AAGTGTRAYY CGGTCACATA TCMTCGCGNG 120

.CAACMCCAA AGCCGNGTCA CCGYCTCCCT GGGGCGCCAC CCCCATCGGT RATGCAACYT 180

..CGCGCCAC CGYCAAAAGG CTCWTTRAGG CGCTAAAGGT CAMCAATTCC TRAGGTVMCN 240

TGGCCCGCCC RAWTYCTRAC CCGCAATWTC GGTAATCGGR AATTTGGGCT 300

/CGoCTTGGG CAATAAGICTN TTGGGCAACG GCGGRWTCYC NCTGGCCGRA ATTCCCNCAT 360

^PmI^^^w n^f TTYCCCGGYT GCCGTAAYTG YTYCNTGGGC GCCYTCGGCC 420

:^^?^r;^ 5^^-^^^^' CMCCAGGCAA TACCKTTGGC riTRAACCAC CGGRATNAAY 480

VTCAASSGTS CTGRANTTRK TNTCNTGRAA AANMCCACCN AACCCGGNTT 540

^r™"" MTC^NCWTrr SCCGGGTTCT GCCGITTTGR AAYCTHIATC CMTYCAAAAG 600

CCAANRAATT CGG^riTGCCA CCTTGGCCGS GGCTGGTTTM CGMWCCTTOR 660

AMATCC-TCCo GCGGGSAAAN .WTSGGNTT SGSCCGGTCC CCCGNAATAT YCNTGGNCCT 720

^^^I;^^' GGGATCCCCN GSGNAYCCGG CCWTKGGGGK TNCCCAGTTG GWACAATTvc 780

^ v^:^ AACCCGGGNC CGGGGGGTGG GSCCCNTTTT CCTMYNNAAA AAGKCTTTGN 840

-^nfYTTTTCCG CNRAANTTCA CCSKCNIOTTT GGNCCNAACY YYYCAANTTC CANACCrrTA 900

AASAAANCYK YGKTYYCCCC TTTTMCCSGS SANCCCCCCM NMSSKNCGGG AAAAAAAGNK 960

TYNGCCTTAN CNSNKTKTTT TNKTYCCCCC NMWNNSNMCY NCBKKCNKRY NGNSNMNCCT 1020

MKYSrCCNNNN SNNNNNKCGN GSNCSGMKYM CMNNCNGMYK NGNKSNNCCC MSC 1073

(2) INFORMATION FOR SEQ ID NO: 333:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1061 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(ii) MOLECULE TYPE: Genomic DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 333:



GNSNGNKNTO TMCAYCWYCT SCACSC3GGTC TATTGCtSGCC GCAATYTOGT CKASAGATCT 
CGATYTCGGC AMNANAARTG TCGTCGTCAA TTTCAGKKTG GTCKTC^V S^SSS 
^GACCRACA CCCTGNGTCA CCCAAAANAC CAACAGCWTC AaS^^ SSS^sc 
TRTCAATYCC CRASCATTTA ACCGTKTCCW TCRAAGGTGC CRAACcSgC AcSJ^ 
CCGCCSGGCA AWTCGCGCTG CCGGCCGGTN TCAGCCTGAT TYCTGAcS rS^S^SS 
TGGYCAMCNT GGTGAAGGCC CWWCCGCCNA AGAACTGGAG GgSSSS SScS 
GRAACCCNAG GAACCCGCGG TAKAANCCGG CRAAACCRAG GcSyTGgS SSS^^ 
NAMSGGTITG CRACKTGGCC RAACCGTTrY CTTGGTCGGC CTCGgS^S SSSSS 
C^^^^^r CYCGGG^CT I.3KYCCCAAT NTGCYCCCGC SS^SS 

^S^C ZTr-C^r^. AATTCCCYTG GTTAATCACC GGGCNCnS^ 

GGTITIGGGC AACCCCNCYS CTOITTTAAA CATTCCGSCC CAAATCGGNC STTGGSAAAT 
TCTOTYCGGT GGGGCSGGCR ANMYTTCTCT YCCCNAASAN CTTAMyS ^S^l 
CGGKCAAAWS NGGGGGGGNA AAGGGCCCCC CGGNTSCKCC GGGgSgS^ SggSS 
AANTTTCSGG GICTSTMSCGG JmCSCCCCC CSGCCAAGRA CCGnStS^ ^^^^^^ 

CcSSSi T^S'cSr 

^n^n^^Z CNCCSGKKGT CCMTSTTTMM MRCCOTTGN GNKI TAN 960 

rirS^ CACCCCCYCK GGGKCSMNNA GAAKTMYWKC CNGGGGNNAN RScSS 102 

GSGKGGGGKG MGAGYSCCKT CTKGCGNCNN YKWTTTCCCC C KSCC..CCNN 1020 



60 
120 
ISO 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 



1061 



(2) INFORMATION FOR SEQ ID NO: 334:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 986 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 334: 

C^^^a C?S"''"° OMTC^ATTGC GGCCGCAWKY I^GTMAASAG 

ggScSJS Jt^S^ cggcacagag tgtgtgcatc tgtgtcanag ctgtcaacgc 
Svr^^ cmcattgcgr aacaccaaac ccgtccgcgg gycaccggcx 

TC.CC.GCAA AAYCCTCCAG GCCACCYCRA AACAAYWYCT CCTGCAACSC ARSCCGTTYC 

cc^cSSI acSS^g^ f ^^'"'"^^ "^^^^^'^^^^ --^s ^SSI^ 

TG^^SSS ?gS^™ AAI^GCNTT CCCCCTSCCC TTRAl^AATT 

TGG^^^ SS^^^ CGGGCICrCTC CTCKTGCCRA WTCCGRWTCC RATNYCGCCA 
ISSSg ?c^^rr' CTTGGTATCC TATATTGTCC CTAAATGCAA 

CcS^SSr rlT^T TTWAMANCAG NGGTTrCTTY CTTCCNAAAC 

SJJ^ScC AATGATOATA ATAATGGTGC TNTCAAACCC CGCNCCCAT^ 

CNATCSGKCC AMMCCCCRGN GGKTANKKGG GNAATTCTMM AACCCCAAGC CATAASNTTV 
C^^rrl ^"""'^ CCAAAACANY NTTNrTGGNY ^mr^Z TZ^^ 
^cS^ C3GYCCAATAA AAMMMSGGYC SAMCCGGAAA wSSSS^ 

3^^^ CNAACCCDAN WNTYCCTNCC RCRCMANTGG CNSGGARTKT 840 

gSSc^ ^C^GGGRANA CCARCCCCAA TTCCTONN™ KNKNCCCNST 900 

"Z^clTc ^"^^^^^^ 



60 
120 
180 



360 
420 
480 
540 
600 
660 
720 
780 



960 
986 



(2) INFORMATION FOR SEQ ID NO: 335:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1074 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:335: 



NGNGGGNKRN ATMMAYCWCT SATYYACCSN GGMNMWATTG CGGCCRMAWT CTNGTMKASA 60 

GATCTMGAAA YTCGGCAAAG AGYATKCTCG GGGGCCAGAT TTNTGGCCCG CAACCGCCGC 120 

ACTTTGCAYW TCAACAKTCC SGGTGCCCCA AAAAAWTCWT ACCCCCATMC TYCKTGCASM 180 

ASYTGCGCCC RATTRAACAC CCGGCCGGCW TGCTGCGCCA GGTATTYCAS CAGYTCAAAY 240 

YCTTTKTAGK TAAAATCCAG CSGGCGGCCA CNCAGCCGGG CGGTKTAGGT GCCTYCRTCA 3 00 

ATMACCAGCY CGCCCAGGGY CACCTTGCCC AAAAYCTCCT GGGTCAGCCA AATTYCCGCS 3 60 

CCGGCCAACM ACCANCCGCA TYCTGGCNTC AATCYCACCG GGCCCGGTGY TAAAMMANMA 420 

GRATCTCKTC MANCCCCCAN TCAGCSYTNA CNGCMACAGC CCGCCTTCTT CAMACCGCCA 4 80 

RTACCGGGWT CAACCGGCCS GTCAAACTCA ACAGGCGGNC AGGCCTCCCC CGGANSAAAG 54 0 

GTCTTACSCC NNYAANAAAA MAAGNTCTGT TTTCCCCCTC CASAASNAAA AANCCCCSGC 600 

CGGGCCTTCN NMMGGGTTTG GGGMANAHAA AARCNCCGGN GGAACGNATC CGAAAMCTCC 660 

CAAGTCNCMT TWAWAACYCN NNAACCCCCC ANTTTTGGGA AAGGNTCCCC NTTMYCCCCC 720 

TTTTASGKTS GGGMMYYCTY TAAAAAAATT CCCCAAAAAG CCCCGGGAAG GGTCMAMCTG 7 80 

GGNAAATTTC CAAMCCNWGK TTNTTYNGGT TMCGGGGGRA AATTYCNCTC CCYYNNNGGG 840 

CSSGSNNNAT TAYGGMSNMT TTTNNAAWTM NSGKKTSAMM YNNKCCMNNN SNNMSMANNK 900 

TNAMCXCCCN CCTCNGNGKY CSCYNCCCSG GNAGNGGRAS MKCCNANMAA AYASGNTTNK 960 

CGGAAMMCNN AATKGNNNSC CCGGASMCMN NNNMAAATMT CNCNKCNSNN AANRGMRACN 1020 

CCCNSNSGMN RRGAARMTNY YCCCCCGSKM GKGNKAAAAW GKYCCCCCCM AAAG 1074 



(2) INFORMATION FOR SEQ ID NO: 336:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1195 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 336: 

MGNGNCNKNT MTACATCWTT CTGCACCSGG GNTCWANTGC GGCCGCAWKY TTGTCGASAG 60 

ATCTCGAAYT CGGCAMGAGG ACWCTCGCRA CGCCCCCACA NACTCTGGCG TGTGTACCCC 12 0 

ATTGNGCGCX TCACGCGCCC AYTGANCCAK TNCACTGGGG TGCCGTYCGC CKTGCGCGGC 18 0 

GGCCTCACGG CKCTSCWTCT RAAGGCWTGG CGCACCGCAT TCGGTTTTCT RAACGCTGGG 24 0 

AAAWTGGCCA GCCGTCTGGC TCATGGGNTC TACGCAACGC CNGCCCCCAA CRCTTTCTTA 3 00 

AATCCGGYCC NTCCTGANCS CTTTGAAYCC CGGGGSAAGA ACTGGTTGCS CNCGAYCTGC 3 60 

TCGAACTTRK TCNAAATCCC GCANAKTGTT TCNTAMGYCC CNCCGGAAGG NGAACCTACT 42 0 

TTCNGGWANG TCGGCNKCCG GCGCTTATCA STCCTGATCA ACGGGGAACT GGYKNNSTTG 48 0 

KGGGAAAAAG RRCCTCAATG MTYGGTCCKC GCTGCGKANC CGCSCCCTGK GYCGCNAATG 54 0 

GAAGGCSMAG GGTTAANGCC MTTYCNYCCR RSCCGTSTGA SGKWTTYCGG MGGANKAMNN 60 0 

NNKMAMWTTK TCRGNGGCCW ATSTSCCGGG CKSTTAKAGA ANACTYCCKW WCCGTNTYSC 66 0 

SAAAGNTKC3 GCGMGTTTTS SCCKMGANGN YCTGATTTSA GGGGGKYKCC CCCGGGGTYC 72 0 

CGAAWKWRKY CCYAGGGGGM GNYCSAGCSC CGMNNATNAG AGNAAGGKTT RYGSTSKNCC 780 

TYTNKGGACC WSCNNCWSAK ANAACNNKKT TGCSCCNTMS AGNKTNKGRT YCCNKTSTTC 84 0 

TAAGAGGAGC TATKMKCGCC CKTGGANGMM GAGWGMGCGC KYCCCSNKRT TCNTNGWAAA 900

TATKSAGMGG TKCCGMAGMK CCSCGTTTKT TKTGANAAMN MSMRKNKKTG CGMGYTCTSC 960

GGGNTTTGTA GAGTAKTCGS CSCSSMWGAC WCSGMCMGNG AGKNKTNNTS YANTGARCGY 1020 

MNNSKTMKMT MSCSCGCGNA GGAGNGCCCC CSANGMSTGY NKGGNMSSNG ARAKGATGGS 1080 

GGCCNCGMNN MGMGGANMGA SANNGMGGMR GGGGGKTGKC TCKCSCCGNS CSANGRAGAA 114 0 

GKTCNGSCGC CGMGGKYGKT KTKTKNKTGG YSTCMSSMMM NAGAAAAGAG AGGGC 1195 

(2) INFORMATION FOR SEQ ID NO: 337: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3572 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 337: 

CCATCTGATC GTTGGCAACC AGCATCGCAG TGGGAACGAT GCCCTCATTC AGCATTTGCA 6 0 

TGGTTTGTTG AAAACCGGAC ATGGCACTCC AGTCGCCTTC CCGTTCCGCT ATCGGCTGAA 120 

TTTGATTGCG AGTGAGATAT TTATGCCAGC CAGCCAGACG CAGACGCGCC GAGACAGAAC 180 

TTAATGGGCC CGCTAACAGC GCGATTTGCT GGTGACCCAA TGCGACCAGA TGCTCCACGC 240 

CCAGTCGCGT ACCGTCTTCA TGGGAGAAAA TAATACTGTT GATGGGTGTC TGGTCAGAGA 3 00 

CATCAAGAAA TAACGCCGGA ACATTAGTGC AGGCAGCTTC CACAGCAATG GCATCCTGGT 3 60 

CATCCAGCGG ATAGTTAATG ATCAGCCCAC TGACGCGTTG CGCGAGAAGA TTGTGCACCG 420 

CCGCTTTACA GGCTTCGACG CCGCTTCGTT CTACCATCGA CACCACCACG CTGGCACCCA 4 80 

GTTGATCGGC GCGAGATTTA ATCGCCGCGA CAATTTGCGA CGGCGCGTGC AGGGCCAGAC 54 0 

TGGAGGTGGC AACGCCAATC AGCAACGACT GTTTGCCCGC CAGTTGTTGT GCCACGCGGT 60 0 

TGGGAATGTA ATTCAGCTCC GCCATCGCCG CTTCCACTTT TTCCCGCGTT TTCGCAGAAA 660 

CGTGGCTGGC CTGGTTCACC ACGCGGGAAA CGGTCTGATA AGAGACACCG GCATACTCTG 72 0 

CGACATCGTA TAACGTTACT GGTTTCACAT TCACCACCCT GAATTGACTC TCTTCCGGGC 78 0 

GCTATCATGC CATACCGCGA AAGGTTTTGC GCCATTCGAT GGTGTCCGGG ATCTCGACGC 84 0 

TCTCCCTTAT GCGACTCCTG CATTAGGAAG CAGCCCAGTA GTAGGTTGAG GCCGTTGAGC 90 0 

ACCGCCGCCG CAAGGAATGG TGCATGCAAG GAGATGGCGC CCAACAGTCC CCCGGCCACG 960 

GGGCCTGCCA CCATACCCAC GCCGAAACAA GCGCTCATGA GCCCGAAGTG GCGAGCCCGA 102 0 

TCTTCCCCAT CGGTGATGTC GGCGATATAG GCGCCAGCAA CCGCACCTGT GGCGCCGGTG 108 0 

ATGCCGGCCA CGATGCGTCC GGCGTAGAGG ATCGAGATCT CGATCCCGCG AAATTAATAC 1140

GACTCACTAT AGGGGAATTG TGAGCGGATA ACAATTCCCC TCTAGAAATA ATTTTGTTTA 1200 

ACTTTAAGAA GGAGATATAC ATATGGGCCA TCATCATCAT CATCACGTGA TCGACATCAT 126 0 

CGGGACCAGC CCCACATCCT GGGAACAGGC GGCGGCGGAG GCGGTCCAGC GGGCGCGGGA 132 0 

TAGCGTCGAT GACATCCGCG TCGCTCGGGT CATTGAGCAG GACATGGCCG TGGACAGCGC 1380

CGGCAAGATC ACCTACCGCA TCAAGCTCGA AGTGTCGTTC AAGATGAGGC CGGCGCAACC 1440

GAGGGGCTCG AAACCACCGA GCGGTTCGCC TGAAACGGGC GCCGGCGCCG GTACTGTCGC 1500 

GACTACCCCC GCGTCGTCGC CGGTGACGTT GGCGGAGACC GGTAGCACGC TGCTCTACCC 1560 

GCTGTTCAAC CTGTGGGGTC CGGCCTTTCA CGAGAGGTAT CCGAACGTCA CGATCACCGC 162 0 

TCAGGGCACC GGTTCTGGTG CCGGGATCGC GCAGGCCGCC GCCGGGACGG TCAACATTGG 168 0 

GGCCTCCGAC GCCTATCTGT CGGAAGGTGA TATGGCCGCG CACAAGGGGC TGATGAACAT 174 0 

CGCGCTAGCC ATCTCCGCTC AGCAGGTCAA CTACAACCTG CCCGGAGTGA GCGAGCACCT 180 0 

CAAGCTGAAC GGAAAAGTCC TGGCGGCCAT GTACCAGGGC ACCATCAAAA CCTGGGACGA 1860 

CCCGCAGATC GCTGCGCTCA ACCCCGGCGT GAACCTGCCC GGCACCGCGG TAGTTCCGCT 192 0 

GCACCGCTCC GACGGGTCCG GTGACACCTT CTTGTTCACC CAGTACCTGT CCAAGCAAGA 1980 

TCCCGAGGGC TGGGGCAAGT CGCCCGGCTT CGGCACCACC GTCGACTTCC CGGCGGTGCC 2 04 0 

GGGTGCGCTG GGTGAGAACG GCAACGGCGG CATGGTGACC GGTTGCGCCG AGACACCGGG 210 0 

CTGCGTGGCC TATATCGGCA TCAGCTTCCT CGACCAGGCC AGTCAACGGG GACTCGGCGA 2160

GGCCCAACTA GGCAATAGCT CTGGCAATTT CTTGTTGCCC GACGCGCAAA GCATTCAGGC 2220

CGCGGCGGCT GGCTTCGCAT CGAAAACCCC GGCGAACCAG GCGATTTCGA TGATCGACGG 2280

GCCCGCCCCG GACGGCTACC CGATCATCAA CTACGAGTAC GCCATCGTCA ACAACCGGCA 234 0 

AAAGGACGCC GCCACCGCGC AGACCTTGCA GGCATTTCTG CACTGGGCGA TCACCGACGG 2400 

CAACAAGGCC TCGTTCCTCG ACCAGGTTCA TTTCCAGCCG CTGCCGCCCG CGGTGGTGAA 2460 

GTTGTCTGAC GCGTTGATCG CGACGATTTC CAGCGCTGAG ATGAAGACCG ATGCCGCTAC 2520 

CCTCGCGCAG GAGGCAGGTA ATTTCGAGCG GATCTCCGGC GACCTGAAAA CCCAGATCGA 2580 

CCAGGTGGAG TCGACGGCAG GTTCGTTGCA GGGCCAGTGG CGCGGCGCGG CGGGGACGGC 2640 

CGCCCAGGCC GCGGTGGTGC GCTTCCAAGA AGCAGCCAAT AAGCAGAAGC AGGAACTCGA 27 00 

CGAGATCTCG ACGAATATTC GTCAGGCCGG CGTCCAATAC TCGAGGGCCG ACGAGGAGCA 2760 

GGAGCAGGCG CTGTCCTCGC AAATGGGCTT TGGATTCAGC TTCGCGCTGC CTGCTGGCTG 2820 

GGTGGAGTCT GACGCCGCCC ACTTCGACTA CGGTTCAGCA CTCCTCAGCA AAACCACCGG 2880 

GGACCCGCCA TTTCCCGGAC AGCCGCCGCC GGTGGCCAAT GACACCCGTA TCGTGCTCGG 2940 

CCGGCTAGAC CAAAAGCTTT ACGCCAGCGC CGAAGCCACC GACTCCAAGG CCGCGGCCCG 3000 

GTTGGGCTCG GACATGGGTG AGTTCTATAT GCCCTACCCG GGCACCCGGA TCAACCAGGA 3 060 

AACCGTCTCG CTYGACGCCA ACGGGGTGTC TGGAAGCGCG TCGTATTACG AAGTCAAGTT 3120 

CAGCGATCCG AGTAAGCCGA ACGGCCAGAT CTGGACGGGC GTAATCGGCT CGCCCGCGGC 3180 

GAACGCACCG GACGCCGGGC CCCCTCAGCG CTGGTTTGTG GTATGGCTCG GGACCGCCAA 3 24 0 

CAACCCGGTG GACAAGGGCG CGGCCAAGGC GCTGGCCGAA TCGATCCGGC CTTTGGTCGC 33 00 

CCCGCCGCCG GCGCCGGCCG GGGAAGTCGC TCCTACCCCG ACGACACCGA CACCGCAGCG 3 3 60 

GACCTTACCG GCCTGAGAAT TCTGCAGATA TCCATCACAC TGGCGGCCGC TCGAGCACCA 3420 

CCACCACCAC CACTGAGATC CGGCTGCTAA CAAAGCCCGA AAGGAAGCTG AGTTGGCTGC 34 8 0 

TGCCACCGCT GAGCAATAAC TAGCATAACC CCTTGGGGCC TCTAAACGGG TCTTGAGGGG 3 54 0 

TTTTTTGCTG AAAGGAGGAA CTATATCCGG AT 3572 

(2) INFORMATION FOR SEQ ID NO: 338: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 338:

Val Gln Phe Gln Ser Gly Gly Asp Asn Ser Pro Ala Val Tyr Xaa Xaa
1 5 10 15

Asp Gly Xaa Arg 
20 

(2) INFORMATION FOR SEQ ID NO: 339:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 339:



Thr Thr Val Pro Xaa Val Thr Glu Ala Arg 

5 10

(2) INFORMATION FOR SEQ ID NO: 340:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 340:
Thr Thr Pro Ser Xaa Val Ala Phe Ala Arg 

(2) INFORMATION FOR SEQ ID NO: 341: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 341: 

Asp Ala Gly Lys Xaa Ala Gly Xaa Asp Val Xaa Arg 

10 

(2) INFORMATION FOR SEQ ID NO:342: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:342: 

Thr Xaa Glu Glu Xaa Gln Glu Ser Phe Asn Ser Ala Ala Pro Gly Asn

10 15 

Xaa Lys 



(2) INFORMATION FOR SEQ ID NO:343: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Other

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 343:
CTAGTTAGTA CTCAGTCGCA GACCGTG 

(2) INFORMATION FOR SEQ ID NO: 344: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Other 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 344:
GCAGTGACGA ATTCACTTCG ACTCC 

(2) INFORMATION FOR SEQ ID NO: 345: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2412 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 345:




60 
120 
180 
240 
300 
360 
420 
480 
540, 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 



1320

^JIEfS^^ GGATCTCCGG CGACCTGAAA ACCCAGATCG ACCAGGTGGA GTCGACGGCA

^nJ^I™ AAGCAGCCAA TAAGCAGAAG CAGGAACTCG ACGAGATCTC GACQAaSS 
CGTCAGGCCG GCGTCCAATA CTCGAGGGCC GACGAGGAGC AGCAGCAGGC G^SJ^SS 

^ITcZ JSSgS^ ^^^^^^^'^^^ l^c^afc 

CACCTGTTGC CCCCCCACCA CCGGCCGCCG CCAACACGCC OAATGCCCAG 

ccgggcgatc ccaacgcagc acctccgccg gccgacccga acgcScgcc SSSSS 

ACGCAGCCCA ACCTGTCCGG ATCGACAACC JSSSS A^S^S^ 
GC3CTGCCTG CTGGCTGGGT GGAGTCTGAC GCCGCCCACT TCGAcScGG JS^SS 
Z'^C^ ""^^^ CCCGCCATTT CCCGGACAGC CgS^JS^ S^SSS^ 
JcSISS cSrS^S AAGCTTTACG CCAGCGCCGA S^S^ 

TCCAAGGCCG CGGCCCGGTT GGGCTCGGAC ATGGGTGAGT TCTATATGCC CTArrrrJ^ 
J^^^'^"' ACCAGGAAAC CGTCTCGCTC GACGCCAACG ^Zt^ SSSS 

™^ cSJSo'^^ cg'L'o'°"'" "^^^^'^^^ ^^^^'^^^ 

JCGC^SS^ -T^^l ™- SSS?- ~A ..0 



(2) INFORMATION FOR SEQ ID NO: 346: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 802 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 346: 
Met Gly His His His His His His Val Ile Asp Ile Ile Gly Thr Ser 
1 5 10 15 

Pro Thr Ser Trp Glu Gln Ala Ala Ala gL Ala Val Gln Arg Ala Arg 
20 25 30 

Asp Ser Val Asp Asp Ile Arg Val Ala Arg Val Ile Glu Gln Asp Met 
35 40 45 

Ala Val Asp Ser Ala Gly Lys Ile Thr Tyr Arg Ile Lys Leu Glu Val 
50 55 60 

Ser Phe Lys Met Arg Pro Ala Gln Pro Arg Gly Ser Lys Pro Pro Ser 
65 70 75 80 

Gly Ser Pro Glu Thr Gly Ala Gly Ala Gly Thr Val Ala Thr Thr Pro 
85 90 95 

Ala Ser Ser Pro Val Thr Leu Ala Glu Thr Gly Ser Thr Leu Leu Tyr 
100 105 110 

Pro Leu Phe Asn Leu Trp Gly Pro Ala Phe His Glu Arg Tyr Pro Asn 
115 120 125 

Val Thr Ile Thr Ala Gln Gly Thr Gly Ser Gly Ala Gly Ile Ala Gln 
130 135 140 

Ala Ala Ala Gly Thr Val Asn Ile Gly Ala Ser Asp Ala Tyr Leu Ser 
145 150 155 160 

Glu Gly Asp Met Ala Ala His Lys Gly Leu Met Asn Ile Ala Leu Ala 
165 170 175 

Ile Ser Ala Gln Gln Val Asn Tyr Asn Leu Pro Gly Val Ser Glu His 
180 185 190 

Leu Lys Leu Asn Gly Lys Val Leu Ala Ala Met Tyr Gln Gly Thr Ile 
195 200 205 

Lys Thr Trp Asp Asp Pro Gln Ile Ala Ala Leu Asn Pro Gly Val Asn 
210 215 220 

Leu Pro Gly Thr Ala Val Val Pro Leu His Arg Ser Asp Gly Ser Gly 
225 230 235 240 

Asp Thr Phe Leu Phe Thr Gln Tyr Leu Ser Lys Gln Asp Pro Glu Gly 
245 250 255 

Trp Gly Lys Ser Pro Gly Phe Gly Thr Thr Val Asp Phe Pro Ala Val 
260 265 270 

Pro Gly Ala Leu Gly Glu Asn Gly Asn Gly Gly Met Val Thr Gly Cys 
275 280 285 

Ala Glu Thr Pro Gly Cys Val Ala Tyr Ile Gly Ile Ser Phe Leu Asp 
290 295 300 

Gln Ala Ser Gln Arg Gly Leu Gly Glu Ala Gln Leu Gly Asn Ser Ser 
305 310 315 320 

Gly Asn Phe Leu Leu Pro Asp Ala Gln Ser Ile Gln Ala Ala Ala Ala 
325 330 335 

Gly Phe Ala Ser Lys Thr Pro Ala Asn Gln Ala Ile Ser Met Ile Asp 
340 345 350 

Gly Pro Ala Pro Asp Gly Tyr Pro Ile Ile Asn Tyr Glu Tyr Ala Ile 
355 360 365 

Val Asn Asn Arg Gln Lys Asp Ala Ala Thr Ala Gln Thr Leu Gln Ala 
370 375 380 

Phe Leu His Trp Ala Ile Thr Asp Gly Asn Lys Ala Ser Phe Leu Asp 
385 390 395 400 

Gln Val His Phe Gln Pro Leu Pro Pro Ala Val Val Lys Leu Ser Asp 
405 410 415 

Ala Leu Ile Ala Thr Ile Ser Ser Ala Glu Met Lys Thr Asp Ala Ala 
420 425 430 

Thr Leu Ala Gln Glu Ala Gly Asn Phe Glu Arg Ile Ser Gly Asp Leu 
435 440 445 

Lys Thr Gln Ile Asp Gln Val Glu Ser Thr Ala Gly Ser Leu Gln Gly 
450 455 460 

Gln Trp Arg Gly Ala Ala Gly Thr Ala Ala Gln Ala Ala Val Val Arg 
465 470 475 480 

Phe Gln Glu Ala Ala Asn Lys Gln Lys Gln Glu Leu Asp Glu Ile Ser 
485 490 495 

Thr Asn Ile Arg Gln Ala Gly Val Gln Tyr Ser Arg Ala Asp Glu Glu 
500 505 510 

Gln Gln Gln Ala Leu Ser Ser Gln Met Gly Phe Val Pro Thr Thr Ala 
515 520 525 

Ala Ser Pro Pro Ser Thr Ala Ala Ala Pro Pro Ala Pro Ala Thr Pro 
530 535 540 

Val Ala Pro Pro Pro Pro Ala Ala Ala Asn Thr Pro Asn Ala Gln Pro 
545 550 555 560 

Gly Asp Pro Asn Ala Ala Pro Pro Pro Ala Asp Pro Asn Ala Pro Pro 
565 570 575 

Pro Pro Val Ile Ala Pro Asn Ala Pro Gln Pro Val Arg Ile Asp Asn 
580 585 590 

Pro Val Gly Gly Phe Ser Phe Ala Leu Pro Ala Gly Trp Val Glu Ser 
595 600 605 

Asp Ala Ala His Phe Asp Tyr Gly Ser Ala Leu Leu Ser Lys Thr Thr 
610 615 620 

Gly Asp Pro Pro Phe Pro Gly Gln Pro Pro Pro Val Ala Asn Asp Thr 
625 630 635 640 

Arg Ile Val Leu Gly Arg Leu Asp Gln Lys Leu Tyr Ala Ser Ala Glu 
645 650 655 

Ala Thr Asp Ser Lys Ala Ala Ala Arg Leu Gly Ser Asp Met Gly Glu 
660 665 670 

Phe Tyr Met Pro Tyr Pro Gly Thr Arg Ile Asn Gln Glu Thr Val Ser 
675 680 685 

Leu Asp Ala Asn Gly Val Ser Gly Ser Ala Ser Tyr Tyr Glu Val Lys 
690 695 700 

Phe Ser Asp Pro Ser Lys Pro Asn Gly Gln Ile Trp Thr Gly Val Ile 
705 710 715 720 

Gly Ser Pro Ala Ala Asn Ala Pro Asp Ala Gly Pro Pro Gln Arg Trp 
725 730 735 

Phe Val Val Trp Leu Gly Thr Ala Asn Asn Pro Val Asp Lys Gly Ala 
740 745 750 

Ala Lys Ala Leu Ala Glu Ser Ile Arg Pro Leu Val Ala Pro Pro Pro 
755 760 765 

Ala Pro Ala Pro Ala Pro Ala Glu Pro Ala Pro Ala Pro Ala Pro Ala 
770 775 780 

Gly Glu Val Ala Pro Thr Pro Thr Thr Pro Thr Pro Gln Arg Thr Leu 
785 790 795 800 

Pro Ala 




(2) INFORMATION FOR SEQ ID NO: 347: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 34 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Other 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 347: 
GGATCCAAAC CACCGAGCGG TTCGCCTGAA ACGG 

(2) INFORMATION FOR SEQ ID NO: 348: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 37 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Other 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 348: 
CGCTGCGAAT TCACCTCCGG AGGAAATCGT CGCGATC 

(2) INFORMATION FOR SEQ ID NO: 349: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1962 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 349: 



CATATGGGCC ATCATCATCA TCATCACGGA TCCAAACCAC CGAGCGGTTC GCCTGAAACG 60 
GGCGCCGGCG CCGGTACTGT CGCGACTACC CCCGCGTCGT CGCCGGTGAC GTTGGCGGAG 120 
ACCGGTAGCA CGCTGCTCTA CCCGCTGTTC AACCTGTGGG GTCCGGCCTT TCACGAGAGG 180 
TATCCGAACG TCACGATCAC CGCTCAGGGC ACCGGTTCTG GTGCCGGGAT CGCGCAGGCC 240 
GCCGCCGGGA CGGTCAACAT TGGGGCCTCC GACGCCTATC TGTCGGAAGG TGATATGGCC 300 
GCGCACAAGG GGCTGATGAA CATCGCGCTA GCCATCTCCG CTCAGCAGGT CAACTACAAC 360 
CTGCCCGGAG TGAGCGAGCA CCTCAAGCTG AACGGAAAAG TCCTGGCGGC CATGTACCAG 420 
GGCACCATCA .^^CCTGGGA CGACCCGCAG ATCGCTGCGC TCAACCCCGG CGTGAACCTG 480 
CCCGGCACCG CGGTAGTTCC GCTGCACCGC TCCGACGGGT CCGGTGACAC CTTCTTGTTC 540 
ACCCAGTACC TGTCCAAGCA AGATCCCGAG GGCTGGGGCA AGTCGCCCGG CTTCGGCACC 600 
ACCGTCGACT TCCCGGCGGT GCCGGGTGCG CTGGGTGAGA ACGGCAACGG CGGCATGGTG 660 
ACCGGTTGCG CCGAGACACC GGGCTGCGTG GCCTATATCG GCATCAGCTT CCTCGACCAG 720 
GCCAGTCAAC GGGGACTCGG CGAGGCCCAA CTAGGCAATA GCTCTGGCAA TTTCTTGTTG 780 
CCCGACGCGC AAAGCATTCA GGCCGCGGCG GCTGGCTTCG CATCGAAAAC CCCGGCGAAC 840 
CAGGCGATTT CGATGATCGA CGGGCCCGCC CCGGACGGCT ACCCGATCAT CAACTACGAG 900 
TACGCCATCG TCAACAACCG GCAAAAGGAC GCCGCCACCG CGCAGACCTT GCAGGCATTT 960 
CTGCACTGGG CGATCACCGA CGGCAACAAG GCCTCGTTCC TCGACCAGGT TCATTTCCAG 1020 
CCGCTGCCGC CCGCGGTGGT GAAGTTGTCT GACGCGTTGA TCGCGACGAT TTCCTCCGGA 1080 
GGTGGCAGTG GGGGAGGCTC AGGTGGAGGT TCTGGCGGGA GCGTGCCCAC AACGGCCGCC 1140 
TCGCCGCCGT CGACCGCTGC AGCGCCACCC GCACCGGCGA CACCTGTTGC CCCCCCACCA 1200 
CCGGCCGCCG CCAACACGCC GAATGCCCAG CCGGGCGATC CCAACGCAGC ACCTCCGCCG 1260 
GCCGACCCGA ACGCACCGCC GCCACCTGTC ATTGCCCCAA ACGCACCCCA ACCTGTCCGG 1320 
ATCGACAACC CGGTTGGAGG ^•T^/^ AG ^"""^^ GCGCTGCCTG CTGGCTGGGT GGAGTCTGAC 1380 
GCCGCCCACT            TTCAGCACTC CTCAGCAAAA CCACCGGGGA CCCGCCATTT 1440 
CCCGGACAGC CGCCGCCGGT GGCCAATGAC ACCCGTATCG TGCTCGGCCG GCTAGACCAA 1500 
AAGCTTTACG CCAGCGCCGA AGCCACCGAC TCCAAGGCCG CGGCCCGGTT GGGCTCGGAC 1560 
ATGGGTGAGT TCTATATGCC CTACCCGGGC ACCCGGATCA ACCAGGAAAC CGTCTCGCTC 1620 
GACGCCAACG GGGTGTCTGG AAGCGCGTCG TATTACGAAG TCAAGTTCAG CGATCCGAGT 1680 
AAGCCGAACG GCCAGATCTG GACGGGCGTA ATCGGCTCGC CCGCGGCGAA CGCACCGGAC 1740 
GCCGGGCCCC CTCAGCGCTG GTTTGTGGTA TGGCTCGGGA CCGCCAACAA CCCGGTGGAC 1800 
AAGGGCGCGG CCAAGGCGCT GGCCGAATCG ATCCGGCCTT TGGTCGCCCC GGCGCCGGCG 1860 
CCGGCACCGG CTCCTGCAGA GCCCGCTCCG GCGCCGGCGC CGGCCGGGGA AGTCGCTCCT 1920 
ACCCCGACGA CACCGACACC GCAGCGGACC TTACCGGCCT GA 1962 




(2) INFORMATION FOR SEQ ID NO: 350: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 652 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 350: 
Met Gly His His His His His His Gly Ser Lys Pro Pro Ser Gly Ser 
1 5 10 15 

Pro Glu Thr Gly Ala Gly Ala Gly Thr Val Ala Thr Thr Pro Ala Ser 
20 25 30 

Ser Pro Val Thr Leu Ala Glu Thr Gly Ser Thr Leu Leu Tyr Pro Leu 
35 40 45 

Phe Asn Leu Trp Gly Pro Ala Phe His Glu Arg Tyr Pro Asn Val Thr 
50 55 60 

Ile Thr Ala Gln Gly Thr Gly Ser Gly Ala Gly Ile Ala Gln Ala Ala 
65 70 75 80 

Ala Gly Thr Val Asn Ile Gly Ala Ser Asp Ala Tyr Leu Ser Glu Gly 
85 90 95 

Asp Met Ala Ala His Lys Gly Leu Met Asn Ile Ala Leu Ala Ile Ser 
100 105 110 

Ala Gln Gln Val Asn Tyr Asn Leu Pro Gly Val Ser Glu His Leu Lys 
115 120 125 

Leu Asn Gly Lys Val Leu Ala Ala Met Tyr Gln Gly Thr Ile Lys Thr 
130 135 140 

Trp Asp Asp Pro Gln Ile Ala Ala Leu Asn Pro Gly Val Asn Leu Pro 
145 150 155 160 

Gly Thr Ala Val Val Pro Leu His Arg Ser Asp Gly Ser Gly Asp Thr 
165 170 175 

Phe Leu Phe Thr Gln Tyr Leu Ser Lys Gln Asp Pro Glu Gly Trp Gly 
180 185 190 

Lys Ser Pro Gly Phe Gly Thr Thr Val Asp Phe Pro Ala Val Pro Gly 
195 200 205 

Ala Leu Gly Glu Asn Gly Asn Gly Gly Met Val Thr Gly Cys Ala Glu 
210 215 220 

Thr Pro Gly Cys Val Ala Tyr Ile Gly Ile Ser Phe Leu Asp Gln Ala 
225 230 235 240 

Ser Gln Arg Gly Leu Gly Glu Ala Gln Leu Gly Asn Ser Ser Gly Asn 
245 250 255 

Phe Leu Leu Pro Asp Ala Gln Ser Ile Gln Ala Ala Ala Ala Gly Phe 
260 265 270 

Ala Ser Lys Thr Pro Ala Asn Gln Ala Ile Ser Met Ile Asp Gly Pro 
275 280 285 

Ala Pro Asp Gly Tyr Pro Ile Ile Asn Tyr Glu Tyr Ala Ile Val Asn 
290 295 300 

Asn Arg Gln Lys Asp Ala Ala Thr Ala Gln Thr Leu Gln Ala Phe Leu 
305 310 315 320 

His Trp Ala Ile Thr Asp Gly Asn Lys Ala Ser Phe Leu Asp Gln Val 
325 330 335 

His Phe Gln Pro Leu Pro Pro Ala Val Val Lys Leu Ser Asp Ala Leu 
340 345 350 

Ile Ala Thr Ile Ser Ser Gly Gly Gly Ser Gly Gly Gly Ser Gly Gly 
355 360 365 

Gly Ser Gly Gly Ser Val Pro Thr Thr Ala Ala Ser Pro Pro Ser Thr 
370 375 380 

Ala Ala Ala Pro Pro Ala Pro Ala Thr Pro Val Ala Pro Pro Pro Pro 
385 390 395 400 

Ala Ala Ala Asn Thr Pro Asn Ala Gln Pro Gly Asp Pro Asn Ala Ala 
405 410 415 

Pro Pro Pro Ala Asp Pro Asn Ala Pro Pro Pro Pro Val Ile Ala Pro 
420 425 430 
Asn Ala Pro Gln 
435 

Phe Ala Leu Pro 
450 

Tyr Gly Ser Ala 
465 

Gly Gln Pro Pro 

Leu Asp Gln Lys 
500 

Ala Ala Arg Leu 
515 

Gly Thr Arg Ile 
530 

Ser Gly Ser Ala 
545 

Pro Asn Gly Gln 

Ala Pro Asp Ala 
580 

Thr Ala Asn Asn 
595 

Ser Ile Arg Pro 
610 

Ala Glu Pro Ala 
625 

Pro Thr Thr Pro 



Pro Val 

Ala Gly 

Leu Leu 
470 
Pro Val 
485 

Leu Tyr 

Gly Ser 

Asn Gln 

Ser Tyr 
550 
Ile Trp 
565 

Gly Pro 

Pro Val 

Leu Val 

Pro Ala 
630 
Thr Pro 
645 



Arg Ile 
440 

Trp Val 
455 

Ser Lys 

Ala Asn 

Ala Ser 

Asp Met 
520 
Glu Thr 
535 

Tyr Glu 

Thr Gly 

Pro Gln 

Asp Lys 
600 
Ala Pro 
615 

Pro Ala 
Gln Arg 



Asp Asn Pro Val 
Glu Ser 
Thr Thr 



Asp Thr 
490 
Ala Glu 
505 

Gly Glu 

Val Ser 

Val Lys 

Val Ile 
570 
Arg Trp 
585 

Gly Ala 

Pro Pro 

Pro Ala 

Thr Leu 
650 



Asp Ala 
460 
Gly Asp 
475 

Arg Ile 



Ala Thr 

Phe Tyr 

Leu Asp 
540 
Phe Ser 
555 

Gly Ser 

Phe Val 

Ala Lys 

Ala Pro 
620 
Gly Glu 
635 

Pro Ala 



Gly Gly 
445 

Ala His 

Pro Pro 

Val Leu 

Asp Ser 
510 
Pro Thr 